Demajh, Inc.

BGM-HAN: A Hierarchical Attention Network for Accurate and Fair Decision Assessment on Semi-Structured Profiles: what it means for business leaders

BGM-HAN parses grades, achievements, and essays in tandem, then uses fairness-aware hierarchical attention to generate transparent admission or hiring scores—helping committees meet equity targets while slashing manual review time.

1. What the method is

BGM-HAN combines byte-pair tokenisation with three-level attention that highlights pivotal tokens, sentences, and data fields. Gated residual blocks fuse quantitative and narrative cues, and a fairness head adjusts for observed demographic skew; the network outputs calibrated acceptance probabilities along with saliency maps ready for audit.
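To make the fusion step concrete, here is a minimal NumPy sketch of a gated residual block combining a quantitative feature vector with a narrative embedding. The shapes, weight names, and sigmoid-gate form are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_residual_fuse(quant, narr, W_g, b_g, W_n):
    """Fuse quantitative features with a narrative embedding.

    A sigmoid gate, computed from both streams, decides per dimension
    how much of each to keep; a residual connection preserves the
    original quantitative signal.
    """
    narr_proj = W_n @ narr                       # project narrative to quant dim
    gate = sigmoid(W_g @ np.concatenate([quant, narr_proj]) + b_g)
    fused = gate * quant + (1.0 - gate) * narr_proj
    return fused + quant                         # residual connection

rng = np.random.default_rng(0)
d_q, d_n = 8, 16                                 # illustrative dimensions
quant = rng.normal(size=d_q)                     # e.g. grades, test scores
narr = rng.normal(size=d_n)                      # e.g. essay embedding
W_g = rng.normal(size=(d_q, 2 * d_q)) * 0.1
b_g = np.zeros(d_q)
W_n = rng.normal(size=(d_q, d_n)) * 0.1
out = gated_residual_fuse(quant, narr, W_g, b_g, W_n)
print(out.shape)  # (8,)
```

Because the gate is elementwise, the model can trust grades for some dimensions of the profile vector and the essay for others.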

2. Why the method was developed

Selection panels confront thousands of semi-structured profiles while regulators demand equal-opportunity proof. Tabular models miss context; LLMs are costly and opaque. BGM-HAN bridges that gap, delivering speed, accuracy, and built-in bias mitigation with clear, attention-based explanations.

3. Who should care

University provosts, scholarship boards, enterprise talent-acquisition leads, and compliance officers in regulated credit or insurance lines—all organisations that rank large applicant pools under fairness scrutiny—stand to gain from BGM-HAN’s interpretability and bias controls.

4. How the method works

A 5k-subword BPE vocabulary tokenises inputs, which are batched into 10 × 50 grids (sentences × tokens). Token-level attention forms sentence vectors; sentence-level attention builds field embeddings; a final field-attention layer yields a profile vector. Cross-entropy plus equality-of-opportunity regularisers guide training on demographically stratified mini-batches, while attention weights are aggregated for human-readable rationales.
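The three-level pooling pipeline above can be sketched in NumPy as repeated applications of one attention-pooling primitive. The additive scoring function, embedding dimension, and field/sentence/token counts below are illustrative assumptions, not the published architecture:

```python
import numpy as np

def attention_pool(X, w):
    """Score each row of X with vector w, softmax the scores, and
    return the weighted sum plus the attention weights."""
    scores = X @ w
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ X, weights

rng = np.random.default_rng(1)
d = 32
# Hypothetical profile: 3 fields x 10 sentences x 50 tokens, dim-32 embeddings
tokens = rng.normal(size=(3, 10, 50, d))
w_tok, w_sent, w_field = (rng.normal(size=d) for _ in range(3))

# Token-level attention -> sentence vectors
sentences = np.stack([[attention_pool(s, w_tok)[0] for s in field]
                      for field in tokens])
# Sentence-level attention -> field embeddings
fields = np.stack([attention_pool(f, w_sent)[0] for f in sentences])
# Field-level attention -> profile vector
profile, field_weights = attention_pool(fields, w_field)
print(profile.shape)  # (32,)
```

The per-level weights (e.g. `field_weights`) are exactly what gets aggregated into the human-readable rationale: they show which tokens, sentences, and fields drove the score.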

5. How it was evaluated

On a 42k-applicant Singapore admissions dataset, BGM-HAN was benchmarked against gradient-boosted trees, logistic regression, RoBERTa-large, and GPT-4. Metrics included macro-F1, demographic-parity gap, latency, and reviewer trust from a double-blind survey; ablations removed residuals, fairness loss, or hierarchical pooling.
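For readers unfamiliar with the fairness metric, the demographic-parity gap is the spread in positive-prediction rates across demographic groups. A minimal sketch (the toy data and group labels are illustrative):

```python
import numpy as np

def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive-prediction rate between any
    two demographic groups; 0 means perfectly equal rates."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Toy example: group "a" accepted at 0.75, group "b" at 0.25
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
gap = demographic_parity_gap(y_pred, groups)
print(gap)  # 0.5
```

Under this definition, the reported improvement from 0.12 to 0.04 means acceptance rates across groups moved to within four percentage points of each other.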

6. How it performed

The model beat the best baseline by 5.8 pp macro-F1, cut demographic-parity gap from 0.12 to 0.04, and processed profiles in 18 ms—20× faster than GPT-4. Reviewers accepted its explanations 87% of the time. (Source: arXiv 2507.17472, 2025)
