Data Engineer
Permanent position, full- or part-time · Berlin, Remote
Who we are
Are you passionate about software quality, cutting-edge tech and healthcare? We're a dynamic Berlin-based medtech startup on a mission to revolutionize radiology with AI-powered software. Our innovative solutions automatically analyze MRI scans, enabling clinicians to make faster, more precise, and objective assessments.
We take pride in our three CE-certified products, mdbrain, mdknee, and mdprostate, which have become market-leading across Europe. Join our team and be part of a company that's transforming the future of medical diagnostics.
Your mission
- You will work in close collaboration with the AI engineers and software engineers to build a large and efficient database of annotated MRI data sets that is compliant with healthcare regulations (e.g. GDPR)
- You will implement and maintain efficient data pipelines: from importing, anonymizing and cleansing fresh data from our clients, through providing this data to our annotators for labeling (segmentation, bounding boxes, …), to storing annotated data sets in a structured way that makes them easy to query and analyze for your fellow AI engineers (a small illustrative sketch follows this list)
- You will set up and maintain monitoring systems for the data pipelines at their different stages, configure alerts for critical behavior, and build dashboards to visualize the parameters and distribution of the imported data
- You will make sure that our databases are in a consistent state and ensure a high quality of the data sets
- You will work with our on-prem installations on our customers' Linux servers, connecting through a range of access methods (VPN, SSH, TeamViewer) to develop and run software that securely exports their data
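For illustration only, here is a minimal sketch of how a pipeline like the one described above might be orchestrated, assuming Apache Airflow (mentioned under "Your profile" below). The DAG name, task names and placeholder functions are hypothetical and invented for this sketch, not a description of our actual stack.

# Hypothetical Airflow 2.4+ sketch; all names and callables below are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def import_raw_scans(**context):
    """Pull fresh MRI studies from a client export location (placeholder)."""
    ...


def anonymize_and_cleanse(**context):
    """Strip patient-identifying metadata and drop corrupt series (placeholder)."""
    ...


def stage_for_annotation(**context):
    """Publish cleansed studies to the annotators' work queue (placeholder)."""
    ...


def store_annotated_dataset(**context):
    """Write finished annotations to the structured dataset store (placeholder)."""
    ...


with DAG(
    dag_id="mri_annotation_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    import_task = PythonOperator(task_id="import_raw_scans", python_callable=import_raw_scans)
    anonymize_task = PythonOperator(task_id="anonymize_and_cleanse", python_callable=anonymize_and_cleanse)
    stage_task = PythonOperator(task_id="stage_for_annotation", python_callable=stage_for_annotation)
    store_task = PythonOperator(task_id="store_annotated_dataset", python_callable=store_annotated_dataset)

    # Linear flow mirroring the bullet above: import -> anonymize -> label -> store
    import_task >> anonymize_task >> stage_task >> store_task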
Your profile
- 3+ years of work experience (industry or academia) in a data engineering role (less experience is also acceptable for junior candidates) and proficiency in ETL and/or ELT processes
- Master's or Ph.D. in Computer Science or a related field
- You know how to handle large data sets in different formats and from heterogeneous sources
- Experience with relational and non-relational databases or data lakes
- Experience with handling medical images or other 3D image data sets, or a passion for it
- Experience with Docker and with cloud computing (e.g. on AWS)
- You have experience working with common workflow orchestration frameworks like Airflow or Luigi
- Experience with monitoring tools and dashboards such as Prometheus, Nagios, Grafana or Metabase
- Strong coding skills in Python and solid general engineering skills: you know about version control, unit testing and continuous integration, and you value best practices and good code quality
- You have experience with Linux and Bash and are comfortable writing and running small scripts that handle data export jobs
- Good communication skills and fluency in English
- Located in Germany
Nice to have:
- Full-stack and, in particular, frontend skills (JavaScript/TypeScript, maybe even in the context of an image annotation tool)
- Experience with the DICOM and NIfTI file formats
- Good IT / system administration knowledge, including remote access methods like VPN, SSH or TeamViewer and secure data transfer protocols like SCP
- Understanding of machine learning algorithms
Why us?
- Our vision has a purpose: your work has a direct effect on the quality of our AI products and thus contributes to making the diagnosis of medical conditions faster, more accurate and more accessible
- You will work on interesting data sets consisting of radiological images of thousands of patients
- You will play a key role in shaping our data infrastructure from the ground up
- You will join a highly interdisciplinary team of motivated and collaborative data scientists, engineers, physicists and medical doctors, with rapid decision-making processes
Mention baito
Do you like what we are doing? You can support us by mentioning that you found this job on baito.