Data Science Intern (NLP and Transformers)

Open Positions: 3
Application Deadline: 30.05.2024

We are seeking a highly motivated Data Science Intern with a strong background in fine-tuning transformer models for NLP tasks, particularly in transcription and translation. The intern will work closely with our advanced AI team to develop, evaluate, and deploy state-of-the-art models that improve our product offerings and contribute to cutting-edge research in AI.

Responsibilities & Duties

  1. Assist in the development and fine-tuning of transformer models for various NLP tasks.
  2. Collaborate with the team to integrate models into our existing systems using frameworks such as TensorFlow, PyTorch, and Hugging Face Transformers.
  3. Contribute to the evaluation and benchmarking of models using appropriate datasets and metrics.
  4. Implement and manage experiments using tools like LangChain, LlamaIndex, and other relevant technologies.
  5. Develop interfaces and applications using Streamlit to demonstrate model capabilities.
  6. Engage in prompt engineering and the deployment of open-source LLMs such as Llama, Gemma, and Mistral.
  7. Document and report findings and methodologies clearly and comprehensively.
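
As a flavor of the evaluation work in item 3, transcription models are commonly benchmarked with word error rate (WER). The sketch below is an illustrative, minimal pure-Python implementation (the function name and structure are our own, not part of any team codebase); in practice a maintained library such as `jiwer` or Hugging Face `evaluate` would typically be used instead.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()

    # dp[i][j] = minimum edits (insert/delete/substitute) to turn
    # the first i reference words into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,                      # deletion
                dp[i][j - 1] + 1,                      # insertion
                dp[i - 1][j - 1] + substitution_cost,  # match/substitution
            )

    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, comparing a reference transcript against a hypothesis with one substituted word out of three yields a WER of 1/3.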

Required Experience, Skills and Qualifications

  1. Currently enrolled in or recently graduated from a graduate program in Computer Science, Data Science, Artificial Intelligence, or a related field.
  2. Demonstrable experience with machine learning, specifically in fine-tuning transformer models for NLP.
  3. Proficiency in programming languages such as Python and libraries/frameworks such as TensorFlow, Pandas, and Hugging Face Transformers.
  4. Experience with NLP tasks like transcription and translation is highly preferred.
  5. Familiarity with open-source LLMs (e.g., Llama, Gemma, Mistral) and their ecosystems.
  6. Strong analytical skills, attention to detail, and the ability to work independently and collaboratively.
  7. Excellent problem-solving skills and a passion for innovative technologies.