LLM Pilot Training Project 2026

Overview

The development of large language models (LLMs) has opened many new possibilities for the use of AI in analysing data, though there are many open questions, including:

  • accuracy and error;
  • privacy;
  • skills.

Despite these concerns, this technology is under-explored in DS-I Africa, and many groups would benefit from using it.

Public LLMs such as ChatGPT, Claude and DeepSeek are of course very important. However, they cannot always be used in our projects: where data are sensitive, providing them to a public LLM is too risky, or regulatory approval is too complex to obtain. The alternative, to run models either in the cloud or locally, is very attractive.

Participation is open to data scientists and trainees in the DS-I Africa Consortium and partners. Participants must be competent programmers.

Skill level of training: Intermediate

Language: English

Credential awarded: Certificate of Attendance

Type of training:

Phase A: Virtual Course (Zoom)
Phase B: Hackathon

Venue (hackathon):
Professional Development Hub (PDH), University of the Witwatersrand, Johannesburg, South Africa

Course dates:

Phase A (online):
Weekly on Tuesdays, 10 February to 7 April, as well as Thursdays 12 February and 26 March, 15h00 – 16h30 CAT (subject to change).

Phase B (hackathon):
13-17 April 2026
08h30 – 16h30 CAT

Application opening date: 10 December 2025

Application closing date: 19 January 2026 at 23:59:59 CAT

Notification date for successful applicants: 26 January 2026

Organisers:
Scott Hazelhurst, Sumir Panji, Michelle Skelton, Kerry Glover, Tshinakaho Malesa, Shaun Aron, Atwine Mugume, Helen Robertson, Ndivhuwo Makondo


Sponsors:
MADIVA, eLwazi Open Data Science Platform, DS-I Africa Consortium

Intended Audience

The course is aimed at graduate students and scientists who are currently working on data science projects in Africa, with preference given to DS-I Africa Consortium members and partners.

Pre-requisites

  • Competent programmer
  • Your own laptop
  • Unix terminal or Windows Subsystem for Linux (WSL)
  • Command-line knowledge and experience working with LLMs
  • Project support and own project funding for in-person Hackathon travel (see below)

Funding

The Hackathon organisers will cover the venue, a return daily shuttle to/from Rosebank Holiday Inn, daily lunch, and refreshment breaks during the hackathon days.

All other expenses, including travel, accommodation, airport transfers, visas, and vaccinations, must be covered by your project/PI.
Project/PI support is required for your attendance.

Curriculum

The Phase A training component is virtual, although we encourage DS-I Africa projects and individuals to form in-person study groups to enable peer-to-peer learning and develop teamwork skills in virtual classrooms. We encourage the model of having a TA in each such classroom.

Phase A training will comprise 10 sessions, each 90 minutes long. Some of the sessions may have a practical component/project that participants are expected to complete.

  1. Introduction to LLMs, overview of existing LLMs
  2. Theory of LLMs: part 1
  3. Theory of LLMs: part 2
  4. Using an API to interact with an LLM
  5. Programming using LLMs – best practices
  6. Programming using LLMs: case study
  7. Ethics and legal issues
    • Bias, privacy
    • Confidentiality and data leakage
  8. Introduction to running LLMs locally
    • Overview of different options
    • Pros/cons of running locally versus cloud, LLM pragmatics
    • Approaches to running locally (e.g., fine-tuning, RAG)
  9. Running LLMs locally: practical exercise
  10. Critical assessment and reflection

Learning outcomes

After this course participants should be able to:

  • Define the fundamental architecture and components of large language models, including transformers, attention mechanisms, and tokenisation;
  • Compare and contrast various LLMs;
  • Identify and assess the ethical considerations related to LLM use, including bias, privacy, confidentiality, and data leakage;
  • Identify appropriate use cases for public LLMs versus locally-run models based on data sensitivity, regulatory requirements, and cost;
  • Implement API calls to interact with LLMs programmatically for data analysis tasks;
  • Apply best practices for prompt engineering and programming with LLMs, including validation of results;
  • Configure and deploy local LLM instances using appropriate tools and frameworks, including customisation and fine-tuning;
  • Assess LLM outputs for accuracy, reliability, and potential errors in scientific data analysis contexts;
  • Design and implement a complete LLM-based solution for a real-world data science problem in the DS-I Africa consortium;
  • Develop custom workflows that integrate LLMs into existing data analysis pipelines;
  • Critically advocate for responsible and ethical use of LLMs in African research contexts.

Limitations

This course provides a foundation for continued learning in research that uses LLMs. Current practices in this field change rapidly.

To apply, please click HERE

DS-I Africa Coordinating Centre
Address: Faculty of Health Sciences, University of Cape Town,
Anzio Road, Observatory, 7925
Email: ds-i-africa.admin@uct.ac.za
Website: dsi-africa.org
Tel: +27 (0)21 650 1509
