Research Project · GEM Lab · The University of Hong Kong

MNotation · Machine + annotation

An interactive annotation platform that helps qualitative researchers code interviews, perform thematic analysis, and build reliable label sets — through three complementary modes: Manual, LLM-assisted, and Active Machine Learning.

About the Project

MNotation (short for Machine + annotation) is a research platform developed at the 💎 GEM Lab, The University of Hong Kong. It is designed to support qualitative coding workflows — especially interview transcripts, thematic coding, and other open-text annotation tasks — by combining traditional manual labelling with modern AI assistance and human-in-the-loop active learning.

The system was built for classroom and workshop settings: researchers can import a dataset, configure a label taxonomy, and invite collaborators to annotate in three distinct modes — comparing speed, agreement, and label quality across human, LLM, and active-learning workflows.

Three Annotation Modes

Each mode is purpose-built for a different stage of the qualitative coding pipeline.


Manual Coding Mode

Pure human annotation — the ground truth.

Researchers code each unit one by one against a configurable taxonomy. Designed for mobile and desktop, with single-tap label selection, instant progress feedback, undo support, and live synchronisation to the moderator dashboard.

  • Configurable label taxonomy
  • Instant submit & undo
  • Real-time progress dashboard

LLM-Assisted Mode

AI predicts, human verifies.

Each unit is pre-labelled by a large language model using researcher-defined prompts (zero-shot and few-shot). Annotators can Accept or Override each suggestion, or Customise the prompt, capturing both the prediction and the human correction for downstream analysis (see the sketch after the list below).

  • Zero-shot & few-shot prompts
  • One-tap accept / override
  • Captures human–AI disagreement
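
For concreteness, here is a minimal sketch of how a zero-shot or few-shot prompt could be assembled from a researcher-defined codebook, and how an Accept/Override decision might be stored. The prompt wording, function names, and record schema below are illustrative assumptions, not MNotation's actual implementation.

```python
# Illustrative only: MNotation's real prompts and storage schema are not
# shown here. This sketches a few-shot prompt builder (zero-shot when no
# examples are given) and a record that preserves human-AI disagreement.
from dataclasses import dataclass

def build_prompt(codebook: dict[str, str],
                 examples: list[tuple[str, str]],
                 unit: str) -> str:
    """Assemble a labelling prompt; zero-shot when `examples` is empty."""
    lines = ["Assign exactly one label to the text.", "Labels:"]
    lines += [f"- {label}: {definition}" for label, definition in codebook.items()]
    lines += [f'Text: "{text}"\nLabel: {label}' for text, label in examples]
    lines.append(f'Text: "{unit}"\nLabel:')
    return "\n".join(lines)

@dataclass
class Decision:
    unit_id: str
    llm_label: str    # the model's prediction
    human_label: str  # what the annotator finally submitted
    action: str       # "accept" or "override"

def record(unit_id: str, llm_label: str, human_label: str) -> Decision:
    """Log both sides so downstream analysis can measure disagreement."""
    action = "accept" if human_label == llm_label else "override"
    return Decision(unit_id, llm_label, human_label, action)

# Example: record("u42", llm_label="Motivation", human_label="Frustration")
# yields action == "override", flagging a human-AI disagreement.
```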

Active Machine Learning Mode

The system chooses what to label next.

Our ED-AL v1 algorithm combines Shannon entropy (uncertainty) with TF-IDF k-center greedy selection (diversity) to pick the most informative samples for human review — yielding more reliable labels with far less annotation effort (see the sketch after the list below).

  • Entropy-based uncertainty sampling
  • k-center diversity selection
  • Transparent "why this sample?" rationale
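
To make the idea concrete, here is a minimal sketch of an entropy-plus-diversity selection step: candidates are first ranked by Shannon entropy, then a k-center greedy pass over TF-IDF vectors spreads the batch out. The pooling rule, parameters, and helper names are assumptions for exposition; ED-AL v1 may combine the two signals differently.

```python
# Hypothetical sketch in the spirit of ED-AL v1, not the published algorithm.
# Step 1 keeps the most entropic (uncertain) unlabeled units; step 2 runs
# k-center greedy over TF-IDF vectors so the batch is also diverse.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import pairwise_distances

def shannon_entropy(probs: np.ndarray) -> np.ndarray:
    """Entropy of each row of predicted label probabilities (n, n_labels)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_batch(texts, probs, labeled_idx, batch_size=10, pool_factor=5):
    """Pick `batch_size` unlabeled units: uncertain first, then diverse."""
    X = TfidfVectorizer().fit_transform(texts).toarray()
    unlabeled = [i for i in range(len(texts)) if i not in set(labeled_idx)]

    # 1) Uncertainty: keep only the most entropic candidates.
    ent = shannon_entropy(probs[unlabeled])
    top = np.argsort(-ent)[: batch_size * pool_factor]
    pool = [unlabeled[i] for i in top]

    # 2) Diversity: k-center greedy; repeatedly take the candidate farthest
    #    from everything already labeled or already chosen.
    centers = list(labeled_idx) or [pool[0]]
    chosen = [] if labeled_idx else [pool[0]]
    dist = pairwise_distances(X[pool], X[centers]).min(axis=1)
    while len(chosen) < min(batch_size, len(pool)):
        far = int(np.argmax(dist))
        chosen.append(pool[far])
        dist = np.minimum(dist,
                          pairwise_distances(X[pool], X[[pool[far]]]).ravel())
    return chosen
```

Entropy alone tends to surface near-duplicate confusing units; the k-center pass spreads the batch across the TF-IDF space, and the entropy and distance scores it computes are the kind of signal a "why this sample?" rationale can display.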

Built for Qualitative Research

MNotation is primarily used for qualitative coding tasks, including:

Interview Transcripts

Code utterance-level data from semi-structured interviews.

Thematic Coding

Develop and apply codebooks across qualitative datasets.

Open-Ended Survey Responses

Cluster, label, and verify free-text answers at scale.

Classroom Discourse

Annotate tutoring dialogues and learner conversations.

Xianghui Meng

Researcher

💎 GEM Lab, The University of Hong Kong

Yujing Zhang

Researcher

💎 GEM Lab, The University of Hong Kong

Jionghao Lin

Principal Investigator · Assistant Professor · Lab Director

💎 GEM Lab, The University of Hong Kong

About 💎 GEM Lab

GEM stands for both Get Everyone Moving and Guiding AI in Education: Methods and Applications. Established in 2024, the lab brings together researchers from HKU, CMU, and beyond to study how generative AI and learning analytics can support — rather than replace — human educators and researchers.

Visit GEM Lab →

Verify Literature With Confidence

While MNotation focuses on coding interviews and other qualitative data, our upcoming SLR System brings the same philosophy — AI suggests, human verifies — to literature annotation. Upload your PDFs, apply your coding scheme, and verify results with full evidence traceability.

Preview the SLR System →

Key Capabilities

📚 Batch PDF Processing

Upload up to 50 PDFs or a ZIP archive for systematic analysis in one session.

🤖 AI-Powered Coding

Automatically apply your pre-defined coding scheme to identify themes across literature.

🔍 Evidence Extraction

Surface relevant passages that support coding decisions with precise document navigation.

Critical Verification

Review AI suggestions transparently — you retain control of every coding decision.

Two Verification Modes

Theme Verification

AI automatically applies your coding scheme to each document. Review the suggested labels side-by-side with the original PDF and adjust as needed.

  • Auto-generated code labels
  • Side-by-side PDF view
  • Confidence scores

Evidence Verification

AI surfaces relevant evidence passages from the document. Click any evidence to jump to its location in the PDF. Rate whether each piece of evidence supports your coding decision.

  • Evidence-based review
  • Click-to-locate in PDF
  • Yes / No feedback + notes

Stay Tuned

The SLR System is under active development. To be notified when it launches — or to participate in early testing — please reach out via our Contact page.

Principal Investigator

Dr. Jionghao Lin

💎 GEM Lab · The University of Hong Kong

Faculty Homepage →

Project Inquiries

For demos, collaborations, or library workshops:

jionghao@hku.hk

Please mention "MNotation" in the subject line so we can route your message to the right team member.

Lab

💎 GEM Lab

Faculty of Education
The University of Hong Kong

Visit the lab website →

Related Project

SLR System

Annotation Tool for Literature Review

slr-system.pages.dev →