Objective: To determine the inter- and intra-rater reliability of expert physiotherapists (PTs) measuring post-stroke shoulder pain with 100 mm vertical visual analogue scales (VAS; intensity, frequency and affective response) and a categorical site-of-pain scale. Design: Three PTs independently rated subjects (normal clinical procedure but with a standardized starting position) on three days, at the same time of day, during one week, in a randomized order determined by a nested Latin square. Reliability for VAS scores was determined with the intraclass correlation coefficient (ICC) and for site-of-pain with the kappa statistic (κ). Acceptable reliability was set at 0.75. The limits of agreement were also calculated. Setting: Community. Subjects: Thirty-three patients, mean time post stroke 42 months (range 7-360). Results: Mean inter-rater reliability (ICC) was 0.79 for intensity, 0.75 for frequency and 0.62 for affective response. The limits of agreement were wide and rater bias was significant for 6/27 ratings. Mean intra-rater reliability (ICC) was 0.70 for intensity, 0.77 for frequency and 0.69 for affective response. For site-of-pain, inter-rater reliability (κ) ranged from 0.156 to 0.385 and intra-rater reliability (κ) ranged from 0.300 to 0.559. Conclusions: Although inter-rater reliability was acceptable for intensity and frequency, there was a consistently large systematic bias between pairs of raters. Agreement might be improved if a standardized assessment procedure were used and/or if training in pain behaviour interpretation were provided.
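The three statistics named in the Design can be illustrated with a minimal Python sketch. It rests on several assumptions: the abstract does not state which ICC model was used, so the two-way random-effects, absolute-agreement, single-rating form, ICC(2,1), is assumed here; the limits of agreement are taken as the mean difference ± 1.96 standard deviations; and the function names, scores and site labels are hypothetical and are not study data.

```python
from collections import Counter
from statistics import mean, stdev


def icc_2_1(ratings):
    """ICC(2,1), assumed model: two-way random effects, absolute agreement.

    `ratings` holds one row per subject and one VAS score per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean([x for row in ratings for x in row])
    row_means = [mean(row) for row in ratings]
    col_means = [mean([row[j] for row in ratings]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)                      # between subjects
    ms_cols = ss_cols / (k - 1)                      # between raters
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)


def limits_of_agreement(scores_a, scores_b):
    """Return (bias, lower, upper): mean difference and bias -/+ 1.96 SD."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias, spread = mean(diffs), 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical VAS scores (mm): the second rater always scores 10 mm higher.
vas = [[10, 20], [50, 60], [90, 100]]
print(icc_2_1(vas))

# Hypothetical site-of-pain labels from two raters.
print(cohen_kappa(["ant", "ant", "post", "post"],
                  ["ant", "post", "post", "post"]))

print(limits_of_agreement([10, 20, 30, 40], [12, 18, 33, 37]))
```

In the hypothetical VAS example the two raters rank the subjects identically, yet the constant 10 mm offset pulls the absolute-agreement ICC below 1. This is the kind of systematic rater bias that the limits of agreement are meant to expose.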