Machine Learning & Data Engineer

Location: Zurich
Sector: AI, Machine Learning & Robotics
Job Type: Contract
Salary: Negotiable
Reference: BBBH591998

Background:

In Roche's Pharmaceutical Research and Early Development organization (pRED), we make transformative medicines for patients in order to tackle some of the world's toughest unmet healthcare needs. At pRED, we are united by our mission to transform science into medicines. Together, we create a culture defined by curiosity, responsibility and humility, where our talented people are empowered and inspired to bring forward extraordinary life-changing innovation at speed. This position is located in Data Products & Platforms, a chapter within the Data & Analytics function, which pushes boundaries of drug discovery and development, enabling pRED to achieve its goals.

The perfect candidate:

The Machine Learning and Data Engineer will be responsible for the end-to-end development and deployment of a semantic search vector database for research purposes and pRED scientific needs. This role requires a combination of skills in machine learning, data engineering, and software development.

General Information:

* Start date: 1.5.2024
* Latest start date: 1.7.2024
* Workload: 100%
* Remote/Home Office: hybrid



Tasks & Responsibilities:

* Integrate off-the-shelf open-source embedding models into the system to generate text embeddings from research publications and other text-based sources.
* Design and implement the data processing pipeline to handle the conversion of PDF, XML or other files into a suitable format for text embedding.
* Set up and maintain the vector database infrastructure, ensuring efficient storage and retrieval of embeddings.
* Develop and maintain the API for semantic search, allowing for robust querying capabilities.
* Collaborate with stakeholders to gather requirements and ensure the system meets the needs of the organization.
* Conduct testing and quality assurance to ensure the reliability and accuracy of the search results.
* Document the system architecture, API usage, and operational procedures for future reference and maintenance.

Must Haves:

* Strong programming skills, particularly in Python, and experience with machine learning libraries (e.g., TensorFlow, PyTorch). (*****)
* Minimum 7 years of experience with data engineering tasks, including data extraction, transformation, and loading (ETL). (*****)
* Familiarity with vector database technologies (e.g., FAISS, Milvus, Elasticsearch) and database indexing. (*****)
* Knowledge of API development and best practices for scalability and security. (*****)
* Ability to work independently, manage multiple priorities, and communicate effectively with both technical and non-technical stakeholders.
* Fluent English.

Contact: Alba Jansa, +41 61 282 22 13
