
Refined National Institutes of Health response algorithm for joint- and fascia-associated chronic graft-versus-host disease

Joint-/fascia-associated graft-versus-host disease (GvHD) occurs in 3–24% of patients with chronic GvHD (cGvHD).1 The consequences of joint/fascia involvement can be severely debilitating, with symptoms including joint stiffness and restricted movement, limb tightness, edema, and subcutaneous sclerosis/fasciitis. In 2005, the National Institutes of Health (NIH) introduced response criteria for joint/fascia GvHD, with the aim of characterizing GvHD organ involvement.2 Subsequently, in 2014, a response algorithm with updated recommendations was implemented.1 Following establishment of these response criteria, ibrutinib became the first approved treatment for cGvHD in 2017.3

The 2014 criteria suggested that a decrease in NIH joint/fascia score or an increase in photographic range of motion (P-ROM) score by ≥ one point at any site be classed as an improvement. Conversely, progression is defined as an increase in NIH score by ≥ one point (including from 0 to 1) or a decrease in P-ROM score by ≥ one point. Although the 2014 algorithm has been instrumental in advancing the treatment of cGvHD, its real-world use has been challenged. Firstly, opposing changes in different joints are considered overall progression, but the appropriateness of this has not been evaluated. Furthermore, a change from 0 to 1 on the NIH organ scoring algorithm is not considered progression at most other GvHD involvement sites, and there is no evidence supporting the exception for joint/fascia. Finally, there are inconsistencies between NIH and P-ROM scores, with one occasionally improving while the other worsens.1

Yoshihiro Inamoto, National Cancer Center Hospital, Tokyo, JP, and colleagues aimed to address these limitations and uncertainties. Here, we summarize their multicenter evaluation of the performance of the 2014 NIH response algorithm for joint/fascia GvHD.1

Study Design

  • Prospective, multicenter, longitudinal, observational study of two independent study cohorts by the Chronic GvHD Consortium
  • Clinically meaningful changes were defined by the distribution method (half a standard deviation [SD]) and anchor-based methods (changes in the measures that correlated with patient- or clinician-reported changes in joint/fascia involvement)

Patients

  • Adult patients (≥ 18 years) with systemically treated cGvHD were recruited from 2007–2012 (training cohort; n = 488) and 2013–2017 (replication cohort; n = 357)
  • Diagnosis and assessment of cGvHD in the training and replication cohorts were made according to the 2005 and 2014 NIH consensus criteria, respectively
  • Training cohort:
    • 209 patients presenting with joint/fascia involvement in ≥ one visit
    • Total visits: 1,578
  • Replication cohort:
    • 191 patients presenting with joint/fascia involvement in ≥ one visit
    • Total visits: 1,195
    • Eight patients (2%) and 93 visits (7%) had joint/fascia abnormalities not caused by GvHD and were excluded
  • cGvHD organ involvement and manifestations were reported by patients and clinicians at enrollment and every six months thereafter

Patient characteristics

  • Table 1 summarizes patient characteristics at the time of enrollment

Table 1. Patient characteristics of the training and replication cohorts at the time of enrollment2

| Characteristic | Training Cohort | Replication Cohort | p* |
|---|---|---|---|
| Total, n | 209 | 191 | |
| Median time from HCT to enrollment, months (range) | 13.5 (3.4–37.3) | 25.2 (3.4–332) | < 0.001 |
| Patient age at enrollment, years (range) | 52 (19–79) | 55 (19–77) | 0.18 |
| Patient sex, male, n | 119 | 122 | 0.18 |
| Stem cell source, n | | | 0.14 |
| • Bone marrow | 12 | 8 | |
| • Mobilized blood cells | 185 | 179 | |
| • Cord blood | 12 | 4 | |
| • Female donor to male recipient | 57 | 58 | |
| HLA and donor type, n | | | 0.06 |
| • Matched related | 101 | 70 | |
| • Matched unrelated | 85 | 94 | |
| • Mismatched | 23 | 27 | |
| Conditioning regimen, n | | | 0.74 |
| • Myeloablative | 106 | 89 | |
| • Nonmyeloablative/RI | 101 | 100 | |
| • Unknown | 2 | 2 | |
| Involved site at enrollment, n | | | |
| • Skin | 138 | 157 | < 0.001 |
| • Eye | 108 | 114 | 0.11 |
| • Mouth | 112 | 106 | 0.76 |
| • Liver | 34 | 18 | 0.05 |
| • GI tract | 63 | 54 | 0.74 |
| • Joint/fascia | 113 | 155 | < 0.001 |
| • Lung | 57 | 76 | 0.01 |
| • Genital tract | 20 | 27 | 0.005 |
| NIH global score at enrollment, n | | | |
| • Mild | 23 | 14 | |
| • Moderate | 131 | 72 | |
| • Severe | 55 | 105 | |
| P-ROM score in all visits, mean ± SD | | | |
| • Shoulder | 6.62 ± 0.74 | 6.40 ± 0.89 | < 0.001 |
| • Elbow | 6.69 ± 0.72 | 6.52 ± 0.83 | < 0.001 |
| • Wrist | 6.26 ± 1.17 | 5.93 ± 1.39 | < 0.001 |
| • Ankle | 3.59 ± 0.57 | 3.49 ± 0.69 | 0.04 |
| • Total score | 23.2 ± 2.34 | 22.4 ± 2.97 | < 0.001 |

GI, gastrointestinal; HCT, hematopoietic cell transplant; HLA, human leukocyte antigen; NIH, National Institutes of Health; P-ROM, photographic range of motion; RI, reduced intensity; SD, standard deviation

*Bold font denotes statistical significance

Results

  • One half of an SD in the total P-ROM score was 1.17 in the training cohort and 1.49 in the replication cohort, suggesting that a two-point change in total P-ROM score is clinically relevant
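
The distribution-method arithmetic above can be reproduced from the total P-ROM SDs in Table 1. The following is a minimal sketch rather than the study's analysis code: the function names are hypothetical, and rounding half an SD up to the next whole point is an assumption made here because score changes are reported in whole points.

```python
import math

# Total P-ROM standard deviations as reported in Table 1 (all visits).
TOTAL_PROM_SD = {"training": 2.34, "replication": 2.97}


def half_sd(sd: float) -> float:
    """Distribution-method estimate of a meaningful change: half of one SD."""
    return sd / 2


def whole_point_threshold(sd: float) -> int:
    """Assumption: scores change in whole points, so round half an SD up."""
    return math.ceil(half_sd(sd))


for cohort, sd in TOTAL_PROM_SD.items():
    print(f"{cohort}: half SD = {half_sd(sd):.3f}, "
          f"threshold = {whole_point_threshold(sd)} points")
```

Under this assumption, both cohorts give the same whole-point threshold, which is consistent with the two-point change described above.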

Divergent* response in individual P-ROM scores

  • In the training cohort, 455 paired visits showed joint/fascia manifestations at either the previous or the current visit
  • Worse individual P-ROM scores occurred in 15–21% of paired visits
  • When individual P-ROM scores were used to calculate overall response, 26% of paired visits were classified as improved, 32% as stable, and 43% as worse†
  • In contrast, when assessed using a total P-ROM score, 88% of cases were classified as stable
  • Divergent responses were observed in 12% of the visits

*Divergent responses were classified as improvement in one joint but worsening in another with reference to individual P-ROM scores

†Worsening in any joint was considered overall worsening, even when other joints had improved
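
To illustrate why the two scoring approaches disagree, the sketch below classifies one paired visit both ways. It is a hedged illustration, not the investigators' code: the function names and the example scores are hypothetical, the four sites are those listed in Table 1, a one-point change per joint follows the 2014 criteria, and the two-point total threshold follows the half-SD result above.

```python
from typing import Dict

SITES = ("shoulder", "elbow", "wrist", "ankle")  # P-ROM sites listed in Table 1


def site_deltas(prev: Dict[str, int], curr: Dict[str, int]) -> Dict[str, int]:
    """Per-site change; a higher P-ROM score means better range of motion."""
    return {site: curr[site] - prev[site] for site in SITES}


def response_by_individual_scores(prev: Dict[str, int], curr: Dict[str, int]) -> str:
    """Any worsened joint counts as overall worsening, even if others improved."""
    deltas = site_deltas(prev, curr).values()
    if any(d <= -1 for d in deltas):
        return "worse"
    if any(d >= 1 for d in deltas):
        return "improve"
    return "stable"


def is_divergent(prev: Dict[str, int], curr: Dict[str, int]) -> bool:
    """Divergent response: one joint improved while another worsened."""
    deltas = site_deltas(prev, curr).values()
    return any(d >= 1 for d in deltas) and any(d <= -1 for d in deltas)


def response_by_total_score(prev: Dict[str, int], curr: Dict[str, int],
                            threshold: int = 2) -> str:
    """Classify on the summed score, using a two-point threshold."""
    delta = sum(site_deltas(prev, curr).values())
    if delta >= threshold:
        return "improve"
    if delta <= -threshold:
        return "worse"
    return "stable"


# Hypothetical paired visit: the shoulder improves while the wrist worsens.
previous = {"shoulder": 6, "elbow": 7, "wrist": 5, "ankle": 4}
current = {"shoulder": 7, "elbow": 7, "wrist": 4, "ankle": 4}
print(response_by_individual_scores(previous, current))
print(is_divergent(previous, current))
print(response_by_total_score(previous, current))
```

In this made-up visit the per-joint rule reports worsening because of the wrist alone, whereas the total score is unchanged and the visit is classed as stable.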

Changes in NIH joint/fascia scores from 0–1

  • In 63 of the 455 paired visits in the training cohort, the NIH joint/fascia score changed from 0 to 1 while the total P-ROM score did not worsen
  • In these 63 visits, very few patients or clinicians perceived the joint condition as worse (Table 2), suggesting that a change in NIH score from 0 to 1 should not be considered worsening

Table 2. Overall assessment for 63 paired visits with change in NIH joint/fascia score from 0–1 without worsening in total P-ROM score2

| Measure | Improved, n (%) | Stable, n (%) | Worse, n (%) |
|---|---|---|---|
| Clinician perception | 32 (51) | 29 (46) | 2 (3) |
| Patient perception* | 18 (34) | 31 (58) | 4 (8) |

*Patient perception not available in ten paired visits

Contrasting NIH and total P-ROM scores

  • Thirteen of the 455 paired visits showed divergence between NIH joint/fascia score and total P-ROM score
  • Clinicians perceived these visits to be mainly stable (54%) or improved (38%), whereas patients perceived their condition as mostly stable (44%) or worse (33%)
  • Although divergence between NIH and overall P-ROM scores is rare, the results from these 13 visits suggest that overall response cannot be determined in these cases

Refined response algorithm for joint/fascia GvHD

  • A refined algorithm for the assessment of joint/fascia GvHD involvement was based on data from the training cohort and is summarized in Table 3

Table 3. Refined response algorithm for cGvHD2

| Subscore | Improve | Stable | Worse |
|---|---|---|---|
| NIH joint/fascia score | Decrease by ≥ one point | No change, or change from 0–1 | Increase by ≥ one point (except for the change from 0–1) |
| Total P-ROM score | Increase by ≥ two points | Change ≤ one point | Decrease by ≥ two points |

Overall assessment algorithm

| Total P-ROM score | NIH joint/fascia score: Improve | NIH joint/fascia score: Stable | NIH joint/fascia score: Worse |
|---|---|---|---|
| Improve | Improve | Improve | Uninterpretable |
| Stable | Improve | Stable | Worse |
| Worse | Uninterpretable | Worse | Worse |

NIH, National Institutes of Health; P-ROM, photographic range of motion
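
The rules in Table 3 can be written as two subscore classifiers and a lookup table. This is a minimal sketch under stated assumptions, not an official implementation: the function and variable names are hypothetical, and the inputs are assumed to be the NIH joint/fascia score and the total P-ROM score at the previous and current visits.

```python
def nih_response(prev: int, curr: int) -> str:
    """NIH joint/fascia subscore: a change from 0 to 1 counts as stable."""
    if curr < prev:
        return "improve"
    if curr > prev and not (prev == 0 and curr == 1):
        return "worse"
    return "stable"


def prom_response(prev_total: int, curr_total: int) -> str:
    """Total P-ROM subscore: only changes of two or more points count."""
    delta = curr_total - prev_total
    if delta >= 2:
        return "improve"
    if delta <= -2:
        return "worse"
    return "stable"


# Overall assessment from Table 3, keyed by (P-ROM response, NIH response).
OVERALL = {
    ("improve", "improve"): "improve",
    ("improve", "stable"): "improve",
    ("improve", "worse"): "uninterpretable",
    ("stable", "improve"): "improve",
    ("stable", "stable"): "stable",
    ("stable", "worse"): "worse",
    ("worse", "improve"): "uninterpretable",
    ("worse", "stable"): "worse",
    ("worse", "worse"): "worse",
}


def overall_response(nih_prev: int, nih_curr: int,
                     prom_prev: int, prom_curr: int) -> str:
    """Combine both subscores into the overall joint/fascia response."""
    key = (prom_response(prom_prev, prom_curr), nih_response(nih_prev, nih_curr))
    return OVERALL[key]


# Hypothetical visit: NIH score goes from 0 to 1, total P-ROM is unchanged.
print(overall_response(0, 1, 22, 22))
```

Keeping the overall assessment as an explicit table, rather than nested conditionals, makes each of the nine cells easy to check against Table 3, including the two divergent combinations that are left uninterpretable.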

  • In both the training and replication cohorts, the proportion of visits classified as worsening joint/fascia GvHD involvement fell from around 50% with the 2014 NIH algorithm to < 20% with the refined algorithm
  • Joint/fascia GvHD involvement was classed as uninterpretable in the case of divergent NIH and total P-ROM scores in the training (3%) and replication (2%) cohorts
  • Reclassification following assessment using the refined algorithm occurred in 40% of the training cohort and 35% of the replication cohort
    • Many of the cases previously interpreted as worsened were reclassified as improved or stable

Conclusion

  • The study provides an improved algorithm for assessing joint/fascia cGvHD involvement that can be used in clinical trials
    • Based on half an SD, a two-point change in total P-ROM score is considered clinically relevant
    • In line with the 2014 NIH criteria scoring for other sites, a change from 0–1 in NIH joint/fascia score should not be considered as worsening
    • Overall response should be defined as uninterpretable when rare divergent responses remain between NIH joint/fascia score and total P-ROM score
  • With the refined algorithm, the proportion of visits classified as worsening joint/fascia GvHD involvement was around 30 percentage points lower in both cohorts than with the 2014 criteria
References
  1. Inamoto Y. et al. Refined National Institutes of Health response algorithm for chronic graft-versus-host disease in joints and fascia. Blood Adv. 2020 Jan 3; 4(1):40–46. DOI: 10.1182/bloodadvances.2019000918
  2. Lee S.J. et al. Measuring Therapeutic Response in Chronic Graft-versus-Host Disease. National Institutes of Health Consensus Development Project on Criteria for Clinical Trials in Chronic Graft-versus-Host Disease: IV. The 2014 Response Criteria Working Group Report. Biol. Blood Marrow Transplant. 2015 Jun 1; 21(6):984–999. DOI: 10.1016/j.bbmt.2015.02.025
  3. Miklos D. et al. Ibrutinib for chronic graft-versus-host disease after failure of prior therapy. Blood. 2017 Nov 23; 130(21):2243–2250. DOI: 10.1182/blood-2017-07-793786