Job Summary
We are a leading UK-based provider of innovative software solutions, delivering technologies that drive efficiency, automation, and intelligent decision-making across a range of industries. We are seeking a skilled AI Expert to join our dynamic team. This role offers an exciting opportunity to develop and integrate state-of-the-art AI capabilities and to play a pivotal part in shaping the future of smart software solutions.
Key Responsibilities
As an AI Expert, you will be responsible for designing and deploying tools powered by large language models (LLMs), including chat assistants and automation modules. You will build machine learning pipelines that support scalable inference deployment, ensuring robust and efficient AI service delivery.
Collaboration is key in this role, as you will work closely with backend developers to integrate AI APIs and manage background processing tasks. A critical part of your work will involve optimizing model performance and reducing latency, especially in environments with limited GPU resources.
You will explore, fine-tune, and customize open-source models such as LLaMA, Mistral, and SQLCoder to meet specific project requirements. Monitoring system performance, session usage, and resource allocation across multiple deployments will be essential to maintaining operational excellence.
Additionally, you will contribute to the AI product roadmap, influence architectural decisions, and help establish best practices for AI deployment within the organization.
Required Qualifications
Candidates must demonstrate proven hands-on experience with large language models (LLMs), natural language processing (NLP), and custom model training or fine-tuning. Strong proficiency in Python and deep learning frameworks such as PyTorch or TensorFlow is essential.
Familiarity with AI development tools and libraries such as Hugging Face and LangChain, together with LLM orchestration and prompt engineering techniques, is required. Experience deploying AI solutions in AWS or Linux-based environments using Docker containers is also necessary.
You should be skilled in integrating high-performance LLM serving frameworks such as vLLM or Text Generation Inference (TGI) into backend APIs, with support for parallel and batched inference. Working knowledge of relational and in-memory databases such as MySQL/MariaDB and Redis, along with experience managing job queues, is expected.
The ability to build and manage GPU-efficient inference workloads to maximize resource utilization is critical for success in this role.
Preferred Qualifications and Benefits
Knowledge of multi-tenant inference services or load balancing strategies for AI APIs will be considered an advantage. Experience in healthcare AI applications or analytics dashboards is a plus, reflecting our interest in expanding into specialized sectors.
This is a full-time, permanent position based at our office, offering a competitive salary ranging from Rs150,000 to Rs300,000 per month, commensurate with experience.
If you are passionate about advancing AI technologies and eager to contribute to impactful, innovative software solutions, we encourage you to apply and join our forward-thinking team.