Posted 2 months ago

This could be the start of a new chapter in your life

Engineer – Machine Learning

Dubai, United Arab Emirates Egypt
November 15, 2025
IT/Software Development Expired

Position Details

Location

Dubai, United Arab Emirates Egypt

Posted Date

November 15, 2025

Employment Type

IT/Software Development

Average Salary

Confidential

Job Description

Company: Presight

Location: Dubai, United Arab Emirates

**Subject: LLM Ops Engineer Position at Presight**

Presight, a public company listed on the ADX and majority-owned by Abu Dhabi’s G42, is a leading regional provider of big data analytics solutions powered by Artificial Intelligence (AI). The company leverages its expertise in big data, analytics, and AI to deliver value across diverse sectors, fostering both business advancement and positive societal impact. Presight’s core capabilities reside in its sophisticated computer vision, AI, and omni-analytics platform, enabling comprehensive data interpretation and facilitating data-driven decision-making in support of policy formulation and the creation of safer, healthier, more sustainable societies.

An opportunity exists for a highly qualified LLM Ops Engineer to spearhead the deployment, scaling, monitoring, and optimization of large language models (LLMs) across heterogeneous environments. This role is integral to ensuring the production readiness, optimal performance, and resilience of our machine learning systems. The successful candidate will possess demonstrated expertise in Python programming, a thorough understanding of LLM internals, and practical experience with various agentic frameworks, inference engines, and deployment methodologies. This position provides a unique opportunity to contribute to the advancement of cutting-edge AI technologies within a dynamic and collaborative setting.

**Responsibilities:**

* Design, deploy, and scale LLM infrastructure across cloud and on-premises environments, encompassing GPU clusters, containers, and Kubernetes orchestration, to guarantee high performance, reliability, and fault tolerance.
* Develop and optimize inference pipelines for low-latency, high-throughput model serving utilizing frameworks such as Triton Inference Server, vLLM, or TensorRT.
* Manage CI/CD pipelines, AI microservices, embeddings storage, and MCP servers, ensuring secure, production-ready deployment of models and tool integrations.
* Deploy and maintain agentic AI frameworks (e.g., Dify, LangFlow) and LLM gateways to manage traffic, enforce audit/compliance controls, and integrate with IAM systems.
* Monitor performance, cost, and resource utilization; implement optimization strategies for GPU, CPU, and storage efficiency while maintaining scalability and reliability.
* Conduct hardware sizing and capacity planning to meet current and projected LLM workload requirements.
* Collaborate with data scientists and engineers to operationalize models and workflows into production-grade systems.
* Develop and maintain comprehensive documentation, runbooks, and deployment playbooks to facilitate knowledge transfer and operational consistency.
* Maintain awareness of emerging LLM techniques, including quantization, distillation, distributed inference, and best practices for production deployments.
* Troubleshoot and resolve production issues, proactively improving infrastructure for enhanced stability, scalability, and maintainability.

**Qualifications:**

* Bachelor’s or Master’s degree in computer science, machine learning, or a related discipline, complemented by a minimum of two years of experience in ML Ops, DevOps, or ML infrastructure, including the production deployment of ML/LLM workloads.
* Proficiency in Python and scripting, with demonstrable experience in containerization, orchestration (Docker, Kubernetes, Helm), CI/CD pipelines, monitoring, and observability for ML systems.
* Expertise in GPU cluster management, distributed inference, high-performance model serving, and scalable, fault-tolerant architectures.
* Competence in cloud/hybrid environments (AWS, GCP, Azure, on-prem), with a comprehensive understanding of security protocols, access control mechanisms, and compliance mandates.

**Preferred Qualifications:**

* Experience in deploying and maintaining agentic AI frameworks (e.g., Dify, LangFlow) and MCP servers for LLM-tool integration.
* Familiarity with LLM orchestration, RAG pipelines, API integrations, and distributed inference frameworks (e.g., Ray).
* Expertise in hardware sizing, capacity planning, cost optimization, and infrastructure-as-code tools (Terraform) for large-scale ML/LLM deployments.
* Hands-on experience with LLM optimization techniques (quantization, distillation, compression).
* Understanding of compliance and governance standards (ISO, NIST) for operational AI systems.

Presight seeks candidates who are performance-driven, possess intellectual curiosity, and exhibit adaptability to ambiguous situations. Ideal candidates will be eager to engage in meaningful collaboration with stakeholders and committed to developing unique, customer-centric solutions. A bias for action and a passion for innovation are core values within the Presight community.

Tips for You

  • Prepare an up-to-date CV before applying
  • Make sure the contact information in your CV is current
  • Read the job description carefully before applying
  • Prepare a cover letter tailored to the job
  • Verify that all the information in your application is accurate
  • Keep a copy of your application
  • Check your email regularly
  • Prepare for the interview in advance
