Customer vulnerability FAQs: outcomes
How do you measure outcomes around managing vulnerable customers and Consumer Duty?
Consumer Duty has shifted the question firms have to answer. It’s not just whether they identified vulnerable customers or offered them some form of support – it’s whether those customers ended up with outcomes comparable to resilient ones. That’s a harder question, and it needs different data to answer. These questions, drawn from an industry Q&A, work through the practicalities of measuring outcomes for vulnerable customers: what to measure, how to build up evidence when the data is thin, how to document good and bad outcomes, and how to evidence appropriate treatment beyond asking customers what they think.
How can you evidence good customer outcomes?
There’s no single measure that does the job. Most firms land on a combination of data points and feedback that, taken together, build up a picture of what customers experience.
The main sources fall into a few categories.
Routine monitoring of operational data. Completion rates, claim acceptance rates, lapse and persistency rates, first-contact resolution, time to complete common tasks. These are already being measured in most firms – the work is in breaking them down by vulnerable cohorts (and comparing them to resilient cohorts) rather than looking only at the aggregate picture. A sketch of this kind of cohort slicing appears after these categories.
Surveys, both one-off and continuing. Outcome-focused surveys (different from customer satisfaction surveys – more on that later) that ask customers specifically about the decisions they made, whether they understood them, and whether the service did what it was meant to. These can run continuously at low volume, or as deeper, periodic snapshots.
Focus groups and qualitative research. Useful for getting underneath the numbers – why a cohort reports a particular problem; what would make a product genuinely better. These are more expensive to run, but necessary when the quantitative data points at something without explaining it.
Claims and complaints data. A useful source, but not enough on its own. Complaints in particular are lagging and selective – plenty of vulnerable customers won’t complain even when things have gone wrong, so a low complaint rate isn’t proof of good outcomes.
Proxies and benchmarks for outcomes that take years to play out. For some products (retirement planning, long-term investments, certain life cover), the real outcome is only visible decades after the interaction that shaped it. Waiting isn’t an option – firms need proxies for the data. Does the customer have a will in place? Have they engaged with the key decisions the product assumes they’ll make? Benchmarking against expected behaviour for the product and cohort can help to fill the gap.
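To make the cohort slicing concrete, here’s a minimal sketch in Python, assuming a hypothetical customer-level table with an outcome flag and a cohort label. The field names are illustrative, not a real schema.

```python
# A minimal sketch of cohort-level slicing of one operational metric.
# Field names ("cohort", "completed") are illustrative, not a real schema.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "cohort": ["resilient", "vulnerable", "resilient",
               "vulnerable", "vulnerable", "resilient"],
    "completed": [True, False, True, True, False, True],  # e.g. a key task
})

# The same metric, aggregated per cohort rather than over the whole book.
rates = customers.groupby("cohort")["completed"].mean()

# The gap against the resilient cohort is the number that matters.
gap = rates["resilient"] - rates["vulnerable"]
print(rates)
print(f"Completion-rate gap vs resilient cohort: {gap:.1%}")
```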
One specific point worth being clear about: outcomes measurement is not the same as customer service measurement. Net promoter score, Trustpilot ratings, satisfaction surveys of the ‘how was your call today?’ kind – these measure how the customer felt about the interaction. They don’t measure whether the customer ended up with the right product, understood what they’d bought, or got genuine value from it. All of those service metrics have a place, but they don’t answer the outcomes question.
The FCA has been clear across its multi-firm reviews that firms relying on existing management information alone are unlikely to meet the bar. New data points, cohort-level slicing, and the chain from identification through mitigation to outcome are all needed. And the aim is continuous improvement – once one area of poor outcomes has been understood and addressed, measurement should shift to examine the next area in more detail. This isn’t a project with an end date; it’s an ongoing discipline.
How can we measure outcomes for vulnerable customers specifically?
The core test, as set out in Consumer Duty, is whether vulnerable cohorts receive outcomes at least as good as resilient customers. If a firm has genuine evidence that vulnerable customers aren’t doing worse, then it doesn’t need to change its processes. But the evidence from charities and disability groups, and from the FCA’s own reviews, consistently points to vulnerable cohorts ending up with worse outcomes across multiple measures. Assuming that no gap exists without the data to prove it isn’t a defensible position.
There’s a foundational problem that has to be solved first. You can’t measure outcomes for vulnerable customers if you haven’t identified who’s vulnerable. Without identification, any cohort analysis is impossible, and firms are reduced to anecdotal evidence from front-line staff – which might be informative but isn’t quantifiable and doesn’t evidence anything at the board or regulator level.
Once identification is in place, two practical methods get firms to cohort-level outcomes data.
Method A – combined surveys. Run a survey that captures vulnerability characteristics and outcome-relevant questions at the same time. Ask about the customer’s circumstances, ask about their experience and the outcomes they’ve had, then correlate the two. This is quicker to set up, because it doesn’t depend on having individual vulnerability records already in place. It’s a good first step for firms whose vulnerability data is still limited.
Method B – correlating customer vulnerability records with outcome data. Where customers’ vulnerabilities have been properly identified and recorded in a structured form, the firm can correlate that individual data with every outcome metric it already tracks. This is the richer option, because it covers the whole book rather than a sampled survey response, and it uses operational data rather than self-reported experience. It only works where the vulnerability data exists in a form the firm’s analytics can actually use – structured, consistent, tied to the customer record.
Most firms will use both. Method A can start now, even while customer vulnerability data is being built up. Method B becomes more powerful as the underlying identification process matures. The CII’s 2025 guidance on managing customer vulnerability similarly recommends a layered approach, starting where you can and deepening over time.
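As an illustration of what Method B looks like mechanically – table and column names here are hypothetical, not any particular system – the core operation is a join between vulnerability records and the outcome metrics the firm already tracks:

```python
# A minimal sketch of Method B: joining structured vulnerability records
# onto existing outcome data. All names and values are illustrative.
import pandas as pd

vulnerability_records = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "characteristic": ["bereavement", "none", "dyslexia"],
})

outcomes = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "claim_accepted": [False, True, True],
    "lapsed_in_year_one": [True, False, False],
})

book = vulnerability_records.merge(outcomes, on="customer_id")
book["cohort"] = book["characteristic"].map(
    lambda c: "resilient" if c == "none" else "vulnerable"
)

# Every outcome metric the firm tracks can now be compared cohort by cohort.
print(book.groupby("cohort")[["claim_accepted", "lapsed_in_year_one"]].mean())
```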
One point worth being explicit about: the aim isn’t to exclude customers from service because they’re vulnerable, or to treat vulnerability itself as a poor outcome. Plenty of vulnerable customers are precisely the people a product is meant to serve. Someone in debt, for example, may be vulnerable at the start and at the end of a debt-management product – that doesn’t mean the firm has failed. The question is whether the customer benefited from the product or ended up worse off because of it. The vulnerability label matters less than the direction of travel.
Where technology captures vulnerability data in a structured form from the outset, outcome reporting by cohort becomes largely automated – the data is already set up to support it, and management information can be produced as a by-product of the normal customer journey rather than assembled painstakingly after the fact.
What specific measures should we use to evidence each of the four Consumer Duty outcomes for vulnerable cohorts?
The four outcomes need different measures, but each can be broken down by vulnerability cohort using data most firms already hold. A workable starter set looks like this.
Products and services. Do vulnerable customers end up in products that fit their needs? Useful measures include suitability at point of sale (product matched to circumstances), take-up of product features by cohort, cancellation rates and reasons (are vulnerable customers cancelling because the product doesn’t work for them?), and complaints citing product mismatch. For advised products, ratings of the advice given, split by cohort.
Price and value. Are vulnerable cohorts getting fair value? Look at price paid by cohort, benefits received and used, claim acceptance rates, persistency and renewal pricing. Watch specifically for cross-subsidies where vulnerable customers are paying more for comparable or worse service. Identify legacy pricing practices that catch specific cohorts disproportionately.
Consumer understanding. Did customers actually understand what they bought or were being told? Comprehension check results at point of sale, time-to-decision patterns, rates of inbound queries after major communications, ‘I didn’t realise’ style complaints, errors in completing forms or processes. For complex decisions, post-decision confirmation that the customer remembers what they chose and why.
Consumer support. Did the firm help effectively when it was needed? First-contact resolution rates, time to resolution, complaint volumes and themes, accessibility of channels (phone availability for customers who can’t use digital channels, for instance), take-up of support offered to vulnerable customers, and whether that support actually helped.
Across all four, the real measure is the gap between vulnerable and resilient cohorts on the same metric. Where the gap is small and consistent, the firm is in reasonable shape. Where there are meaningful gaps, those become the investigation list.
Two practical notes. These aren’t a fixed set – firms should adapt to their product mix. And each measure needs to be sliced by cohort from the outset, because a blended average almost always hides the cohort-level story.
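As a sketch of what the resulting gap view might look like – the measures, the numbers, and the five-point flag threshold below are all illustrative, not recommendations:

```python
# A minimal sketch of a cohort gap table across several measures, assuming
# per-cohort rates have already been computed. Illustrative numbers only.
measures = {
    # measure name: (vulnerable-cohort rate, resilient-cohort rate)
    "claim_acceptance": (0.81, 0.93),
    "first_contact_resolution": (0.70, 0.74),
    "comprehension_check_pass": (0.64, 0.88),
    "persistency_year_one": (0.90, 0.91),
}

for name, (vulnerable, resilient) in measures.items():
    gap = resilient - vulnerable
    flag = "investigate" if gap > 0.05 else "ok"  # threshold is a judgement call
    print(f"{name:26s} gap {gap:+.1%}  {flag}")
```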
How should you measure outcomes for a debt recovery team, where most customers could be considered vulnerable by circumstance?
This is a genuine challenge – when nearly every customer a team deals with is vulnerable, the usual vulnerable-versus-resilient comparison breaks down. But that doesn’t mean outcomes can’t be measured. It means the comparison has to be built differently.
A few approaches that tend to work.
Segment within the vulnerable population. Customers in debt aren’t all vulnerable in the same way. Some are temporarily vulnerable because of a specific life event (redundancy, illness, bereavement). Some are vulnerable primarily through low financial resilience. Some have multiple overlapping vulnerabilities (mental health alongside financial stress, for instance). Severity varies widely. Outcomes for each of these sub-cohorts can and should be compared against each other. A customer with straightforward financial vulnerability and no other factors should be achieving better outcomes than one facing multiple, interacting vulnerabilities – if they’re not, something’s worth looking at. A sketch of this kind of sub-cohort comparison appears after these approaches.
Compare against expected or target outcomes for the product. Debt recovery has specific, measurable outcomes: sustainable repayment plans set up, customers engaged and responsive, affordable arrangements maintained over time, customers ultimately clearing their debt or moving onto a stable footing. Each of these can be measured directly. Firms can set target outcomes at the design stage and measure actual performance against them, cohort by cohort.
Track progression out of vulnerability where that’s the intent. For products designed to help people through a difficult period – debt management, hardship support, recovery schemes – a good outcome often means the customer is in better shape at the end than at the start. Measure that directly. Percentage of customers whose circumstances have improved, percentage who have cleared their debt, percentage who have stabilised, and so on. The rates over time tell you whether the product is working.
Compare against industry benchmarks where available. Debt advice bodies like StepChange, the Money and Pensions Service, and the Money Advice Trust produce sector data on outcomes for people in debt. Firms can benchmark their own performance against this.
Look at what happens after. For a debt recovery process, a meaningful outcome is what the customer’s financial position looks like a year or two after the intervention. If the customer is back in difficulty a year later, the ‘successful’ resolution wasn’t sustainable. Longer-term tracking, even sample-based, reveals this.
Use qualitative evidence seriously. Focus groups, customer interviews, and feedback from debt advice charities about how your team is perceived all carry weight here. They’re harder to reduce to a dashboard but they’re genuine evidence of outcomes.
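Here’s a minimal sketch of the sub-cohort comparison described in the first approach above – the labels, metrics and values are illustrative only:

```python
# A minimal sketch of comparing sub-cohorts within an all-vulnerable book.
# Sub-cohort labels and outcome fields are illustrative, not a real schema.
import pandas as pd

cases = pd.DataFrame({
    "sub_cohort": ["life_event", "financial_only", "multiple",
                   "financial_only", "multiple", "life_event"],
    "sustainable_plan": [True, True, False, True, False, True],
    "debt_cleared_24m": [True, True, False, False, False, True],
})

# With no resilient cohort to compare against, sub-cohorts are compared
# against each other and against the product's target outcomes.
print(cases.groupby("sub_cohort")[
    ["sustainable_plan", "debt_cleared_24m"]
].mean())
```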
The FCA’s Consumer Credit Sourcebook and its work on Borrowers in Financial Difficulty set the regulatory expectations for consumer credit specifically, and make clear that firms must assess affordability, engage sympathetically, and document the basis for decisions. Evidencing outcomes for this customer base means showing the firm is doing all of that well, not that it’s serving a resilient customer base.
The common weakness the FCA has flagged is debt recovery teams treating high vulnerability levels as a reason they can’t measure outcomes – when, if anything, the opposite is true. High vulnerability makes outcome measurement more important, not less.
Could you give examples of how good and bad outcomes for vulnerable customers can be documented or expressed?
Yes, and using worked examples is often the clearest way to make this land. A good outcome and a bad outcome can look very similar on a surface-level metric, so the framing matters.
Example: dyslexia and a mortgage application:
Bad outcome. The customer struggles to read the documentation, doesn’t want to admit it, signs without fully understanding what they’ve committed to, and later discovers features of the mortgage they didn’t realise were there. A complaint follows years into the product. On the firm’s records, this looks like a successfully sold mortgage that later generated a complaint. No one joined the dots at the time.
Good outcome. At the assessment stage, the adviser identifies that the customer has dyslexia and prefers verbal explanation. Key features are explained through a short video the customer can rewatch. The adviser verifies understanding before the application completes – ‘just so I know I’ve explained this well, can you tell me in your own words what happens if rates rise?’ The customer signs up fully informed, and their satisfaction with the product remains stable over time.
This would be documented as:
Characteristic (dyslexia, moderate severity).
Mitigation offered (video-format documentation, verbal explanation, comprehension check).
Mitigation taken up (yes).
Outcome (customer demonstrated understanding, completion successful, no subsequent complaint or query indicating confusion). Comparable to the resilient cohort on all measures.
Example: bereavement and pension decumulation:
Bad outcome. Recently widowed customer phones about their deceased spouse’s pension. Under time pressure, speaking to whoever picks up, they make irreversible decisions about taking a lump sum that turn out later not to have been in their best interest. The firm records the interaction as a customer-instructed transaction.
Good outcome. The firm identifies the bereavement, flags that the customer is in a vulnerable state, and pauses for a cooling-off period before any major irreversible decisions. A specialist team takes over. Options are explained in writing and verbally, with a named contact point. The customer is encouraged (but not required) to involve a trusted family member or take independent advice. The final decision is taken a few weeks later, with full information and in a less acute emotional state.
This would be documented as:
Characteristic (recent bereavement, severity high in the immediate period).
Mitigation offered (cooling-off period, specialist team, extended timeline, signposting to independent advice).
Mitigation taken up (partially – customer declined independent advice but accepted specialist handling and the cooling-off period).
Outcome (decision taken in a more stable state, documented understanding, no subsequent complaint or regret-based contact).
Example: fluctuating mental health condition and a credit card:
Bad outcome. Customer has bipolar disorder. During a manic episode, they apply for credit and increase their spending dramatically. The application is approved on standard affordability criteria. The limit is raised when the account shows high usage. Six months later, the customer is in serious arrears, struggling, and increasingly distressed.
Good outcome. Customer has disclosed the condition and the firm has recorded it as a fluctuating vulnerability with periodic risk of poor financial decisions. When an unusual application pattern or spending spike occurs, the system prompts additional engagement rather than automatic approval. A conversation happens – offered, not demanded – checking that the customer genuinely wants the change. Limits are kept cautious. The customer stays on stable financial footing across their episodes.
This would be documented as:
Characteristic (bipolar disorder, fluctuating).
Mitigation strategy (enhanced engagement triggers on large or unusual applications, cautious credit limits, periodic check-ins).
Outcomes (stable account usage across episodes, no arrears, customer retains confidence in the firm, periodic feedback positive).
A few practical points about how to document these kinds of cases.
Keep it factual and specific. What the customer said or did, what was offered, what they took up, what happened afterwards. Avoid judgement language.
Link characteristic, mitigation and outcome. The chain is what makes it evidence. A recorded vulnerability with no mitigation looks like inaction. A recorded mitigation with no outcome tracking looks like activity without effect.
Track outcomes over time, not just at the point of interaction. A mortgage sold today has outcomes that only really show up over years. Keep the record live.
Aggregate to cohort level for reporting. Individual cases illustrate; cohort data evidences. You need both, but the board report lives at the cohort level.
Include negative examples. Evidencing good outcomes is strengthened, not weakened, by also capturing and learning from cases that went wrong. Firms that only document success look like they’re curating the record.
These examples aren’t templates to copy – they’re illustrations of what rich outcome documentation looks like when done well. Most firms will find their own patterns once they start capturing this kind of chain.
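For illustration only, here’s one way the characteristic–mitigation–outcome chain might be held as a structured record, so that individual cases can later be aggregated to cohort level. The field names are hypothetical, not a prescribed schema.

```python
# A minimal sketch of one case held as a structured record. Field names
# are hypothetical; the point is that the chain is linked in one place.
from dataclasses import dataclass, field

@dataclass
class VulnerabilityCase:
    customer_id: int
    characteristic: str                  # e.g. "dyslexia"
    severity: str                        # e.g. "moderate"
    mitigations_offered: list[str] = field(default_factory=list)
    mitigations_taken_up: list[str] = field(default_factory=list)
    outcome: str = "pending"             # kept live and updated over time

case = VulnerabilityCase(
    customer_id=101,
    characteristic="dyslexia",
    severity="moderate",
    mitigations_offered=["video documentation", "verbal explanation",
                         "comprehension check"],
    mitigations_taken_up=["video documentation", "comprehension check"],
    outcome="understanding demonstrated; completed; no subsequent complaint",
)
```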
How can we measure outcomes for vulnerable customers when our data is still thin?
Start where you can, build from there, and be honest about the gap while you’re closing it. No firm builds complete outcome management information overnight, and the FCA’s position is that firms are expected to be on a credible trajectory towards it – not to have finished the work already.
A few practical steps for firms in this position.
Start with surveys that capture both vulnerability and outcomes together. If your underlying customer vulnerability identification is still patchy, a periodic survey can give you cohort-level outcome data right now. Run it across a representative sample of customers, ask about circumstances (in plain language, not using the word ‘vulnerable’), and ask about outcome-relevant experience. Correlate the two. You’ll have usable insights within a few weeks of launching, far sooner than you would by building up structured vulnerability records across the whole book.
Use proxy segmentation where direct data is missing. While you’re building up direct vulnerability data, some reasonable proxies will give you early signal. Age distribution, product features used, customer service interaction patterns, channel preferences. None of these is a reliable indicator of vulnerability on its own – and you shouldn’t label customers as vulnerable on the basis of proxies – but they can suggest where to look. If older customers on your book are showing a different pattern of outcomes, that’s a flag, even before you’ve done structured assessments.
Work with what you have, on a focused area. Rather than trying to cover everything, pick one product or customer journey and build a proper picture for it first. Most firms have more data than they realise once they look for it – complaints data, claims data, contact centre records, cancellation reasons. A structured analysis of one area, looking specifically for vulnerability-related patterns, often reveals more than thin coverage of everything.
Be transparent about what you can and can’t measure yet. A board report that says ‘here’s what we can evidence, here’s what we’re building, here’s the timeline’ is more credible than one that claims complete coverage. The FCA’s reviews have consistently praised firms that are candid about gaps and on a clear trajectory, and criticised firms that presented MI as complete when it plainly wasn’t.
Invest in structured identification as the priority. The fastest way out of thin data is to get the identification process right. Proactive assessment, structured fields, consistent categorisation, appropriate technology. Once the underlying data is in place, reporting becomes straightforward and correlating individual vulnerability data with outcomes becomes practical.
Benchmark where direct comparison isn’t possible. Industry data from Financial Lives, the Money and Pensions Service, sector reports, and (where available) peer firms. Benchmarks don’t tell you about your specific customers, but they tell you whether your aggregate numbers look plausible for a firm of your kind. A firm identifying 6% of customers as vulnerable can benchmark that figure against the roughly 50% population rate and draw useful conclusions, even without richer data.
Set a realistic plan and work it. Map out what complete outcome management information looks like for your firm, where you are today, and what the stages are to get from here to there. Most firms need at least twelve to eighteen months to build a mature cohort-level reporting capability. The FCA doesn’t expect instant perfection – it expects evidence of a plan and progress.
The common trap is firms deciding they can’t do anything useful until the full picture is in place. That’s rarely true. Partial data, honestly presented, is always more useful than nothing. And waiting for perfect data leaves customers without the improvements that even partial insight could drive.
A lot of vulnerable customer management information measures the nature of vulnerability and the support offered. How do you robustly evidence that vulnerable customers are actually getting outcomes at least as good as all customers?
This is the core Consumer Duty question, and it’s where many firms’ current management information falls short. Measuring what you’re doing is useful but it isn’t the same as measuring what customers are getting.
Robust evidence typically has several dimensions.
Outcome comparison at cohort level, across the four Consumer Duty outcomes. For each outcome – product and service, price and value, consumer understanding, consumer support – a direct comparison of vulnerable and resilient cohorts, using the same underlying measures for both. Completion rates, claim acceptance rates, persistency, lapse rates, complaint rates, comprehension scores, first-contact resolution. If the gap is small and consistent, the firm is in good shape. If meaningful gaps show up, those are the things to investigate.
The chain from characteristic to outcome. Not just ‘we identified this customer as vulnerable’ and ‘this customer had a good outcome’ – but the causal chain between the two. What was the characteristic, what mitigation was offered, was it taken up, and what was the outcome? Without the chain, you’ve got two data points that may or may not be related. With the chain, you’ve got evidence.
Comparable customer feedback by cohort. Outcome-focused feedback (not customer service feedback), asked of vulnerable and resilient customers using the same instrument. If the two cohorts answer the same question differently, that’s a cohort-level insight. If they answer similarly, that’s evidence the firm is meeting the Duty.
Behavioural data by cohort. What customers actually do, as distinct from what they say. Use rates of product features, completion of key actions, engagement with important communications, successful navigation of critical decisions. If vulnerable customers are demonstrably doing all these things at comparable rates to resilient ones, that’s strong evidence.
Complaint and query patterns by cohort. Not just volumes but causes. Are vulnerable customers complaining or querying about specific things that resilient customers aren’t? That tells you something about where the service is failing them specifically.
Longitudinal tracking. One snapshot is weaker than a trend. If the cohort gap is stable over time, that’s a different finding from one that’s widening or closing. The FCA expects firms to be on a trajectory, and trajectory is only visible across time.
External corroboration where available. Independent research, charity sector data, published peer benchmarks. External reference points strengthen the internal picture and make it harder to ignore.
Where all these sources converge on the same conclusion – vulnerable customers are receiving comparable outcomes, or they aren’t – you’ve got robust evidence. Where they disagree, there’s more work to do to understand why.
A practical note on presentation. Evidence becomes much stronger when the firm can show the loop – issue identified in the data, investigation done, action taken, outcome improved. A single snapshot can be ambiguous; a sustained improvement story over time is hard to argue with. The board reports the FCA has responded best to in its reviews have consistently had this loop structure running through them.
A common weakness is management information that lists what the firm did rather than what customers got. ‘We trained X staff, updated Y policies, delivered Z initiatives’ is input data. The outcome question is what happened next. Input data is useful as context, but it isn’t evidence of outcomes and it won’t satisfy the FCA on its own.
How big does a gap between vulnerable and resilient cohorts need to be before we act on it?
The FCA has deliberately not set a specific threshold, which leaves firms to make their own judgements. A few dimensions help frame the call.
Absolute size. A one-percentage-point gap in claim acceptance rates is very different from a ten-point gap. There’s no magic number, but a single-digit gap on most measures is more likely to be noise or margin than a fifteen or twenty-point gap.
Relative importance. A two-point gap matters more on a high-stakes measure (a claim accepted or not) than on something peripheral. Think about what the measure represents for the customer, not just the number.
Number of customers affected. A gap that applies to 200 customers is a different matter from one that applies to 20,000. Small gaps on large populations can still be material in aggregate.
Severity of the outcome. Missed payments that lead to arrears and financial harm are more material than shorter decision times. The downstream consequences of the gap matter.
Stability over time. A gap that shows up in one quarter and disappears the next is less concerning than one that persists or widens. Trend matters.
Sample size and confidence. A gap based on 30 customers isn’t the same as one based on 30,000. Before treating a finding as real, check whether the cohort is large enough to support the conclusion.
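On the sample-size point specifically, a standard two-proportion z-test is a quick way to check whether a gap could be sampling noise. A minimal sketch, with illustrative numbers:

```python
# A minimal sketch of testing whether a cohort gap in a rate is
# distinguishable from sampling noise. Numbers are illustrative.
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Return the z statistic and two-sided p-value for a gap in rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Claim acceptance: 240 of 300 vulnerable vs 1840 of 2000 resilient.
z, p = two_proportion_z(240, 300, 1840, 2000)
print(f"gap {240/300 - 1840/2000:+.1%}, z = {z:.2f}, p = {p:.4f}")
```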
A practical rule of thumb: if you’d be uncomfortable explaining the gap to a customer who fell on the wrong side of it, it’s probably worth acting on. ‘The product pays out for most people but not for you because you have depression’ isn’t defensible at any size, even if the aggregate claim acceptance rate looks fine.
The FCA’s position is that firms should investigate gaps, understand the cause, and act proportionately. Materiality is a judgement, not a number – but it’s a judgement the firm has to make and evidence, not avoid.
How do you identify what the right outcome for a vulnerable customer is – if a process can’t be changed, is clear communication to ensure understanding enough?
The short answer: clear communication is always necessary, but it isn’t always sufficient. The right outcome is whatever outcome the product or service was genuinely meant to deliver for the customer – and if a process can’t currently achieve that for a specific cohort, the process probably needs to change.
The whole aim of Consumer Duty is to reduce poor outcomes for consumers. Firms have to measure poor outcomes, but they also have to put strategies in place to minimise them. That may well mean changes to systems, to the firms they work with, to how products are designed and sold, to the communication channels used. There’s no inherent limit to what should be on the table if it’s needed to reduce poor outcomes.
A useful framework for thinking about this:
First, define what a good outcome looks like. For a given product and customer cohort, what does a good outcome actually mean? Customer ended up in an appropriate product for their circumstances? Understood what they bought? Used it successfully? Achieved their underlying objective? Wasn’t disadvantaged by a feature they couldn’t access or didn’t understand? Be specific – ‘the customer was happy’ isn’t a good outcome definition, because happiness at the point of sale can coexist with genuine harm that surfaces later.
Second, check whether current processes can actually achieve that outcome for each cohort. If they can, with appropriate adjustments, the communication-and-adjustment path is the right one. If they can’t, communication isn’t going to close the gap on its own.
Third, where processes can’t deliver the good outcome, look for the real change. The change might be to the product itself (redesign a feature that disadvantages a cohort), the sales process (add a cooling-off period, require comprehension checks), the service model (specialist teams for complex cases), the channel mix (make sure phone is an option for customers who can’t use digital channels), the pricing approach, the underwriting criteria, the distribution chain. Any of these might be the right answer, depending on what the data shows.
Fourth, prioritise. Not every gap is equally material. Start with the worst poor outcomes – the cohorts being hurt most, the products where the gap is widest, the issues that affect the most customers. Work down from there. This is an ongoing process, not a fix-once exercise.
Some specific points worth being clear about:
Clear communication is never the whole answer on its own. A customer who fully understands a product that doesn’t suit them has understood something bad, not achieved a good outcome. Communication is necessary for informed decision-making; it isn’t sufficient for appropriate product-customer matching.
‘We can’t change the process’ is rarely true. It’s often the case that a process can’t be changed quickly, or without cost, or within the current technology. But those are constraints to work through, not conclusions. The FCA’s expectation is that firms do what’s necessary, proportionately and on a reasonable timeline, to prevent foreseeable harm.
Legacy products and legacy systems aren’t a defence. If a product delivers poor outcomes for a vulnerable cohort, ‘we’ve always done it this way’ doesn’t meet the bar. Legacy issues are where the most significant improvement often happens.
Inclusive design reduces the ‘process can’t be changed’ problem at source. Products and services designed from the outset to work for a wide range of customers rarely create the cohort gaps that then have to be patched. Where the gap exists in a legacy product, redesign is often the right answer – and for new products, inclusive design should be the default.
The FCA's findings on Consumer Duty board reports have not been flattering, and the signal is that firms treating communication or training as sufficient responses to cohort outcome gaps shouldn’t expect that position to be sustainable. The direction of travel is clear: the FCA is likely to start using its powers to drive the issue, and firms whose plans rest on the assumption that clear communication is enough may be in for a surprise.
The honest underlying test is whether the firm has done what a reasonable person would do if they knew what the firm knows. If the data shows a cohort being hurt and the firm has the capacity to act, acting is what’s required – whether that means communication, adjustment, redesign or something more fundamental.
How do you measure outcomes for long-duration products where the real outcome is decades away?
Waiting isn’t an option – customers buying pensions, life cover, long-term investments or mortgages need to be served well now, even though the full outcome only shows up much later. The practical answer is to measure intermediate markers that correlate with eventual outcomes, and to use proxies that stand in for the long-run result.
A few useful intermediate measures.
Point-of-sale suitability. Did the product fit the customer at the point of purchase? Affordability assessments, needs analysis, and understanding of the key trade-offs are all measurable at the time. Where these are weak, long-run outcomes are almost certainly going to be weaker too.
Engagement with key decisions. Long-duration products often involve important decisions along the way – pension contribution increases, investment rebalancing, nominating beneficiaries, making a will where relevant. Whether customers are engaging with these is an early signal. Low engagement by vulnerable cohorts is an actionable finding now, even though the consequences only show up later.
Behavioural proxies. Lapse rates in the early years of a long product, payment pattern stability, changes in product features used. These predict long-run outcomes with reasonable accuracy.
Industry benchmarks. Sector research reports on long-duration outcomes for comparable products and customer cohorts. Where a firm’s intermediate data looks similar to the sector, that’s a reassuring signal. Where it diverges, that’s worth investigating.
Targeted sampling. For specific long-duration products, periodic deep reviews of a sample of customers who bought several years ago, tracking what’s happened since. This doesn’t give you book-wide coverage, but it gives you real evidence on actual outcomes over real timescales.
On proxies specifically, the principle is to pick something the firm can observe now that correlates well with the eventual outcome. For retirement planning, having a will in place, an up-to-date nomination, and an understanding of decumulation options are all reasonable proxies for a good retirement outcome – and all can be measured now. For long-term investments, engagement with statements, understanding of charges, and stability of strategy are useful markers.
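A minimal sketch of what tracking such proxies can look like, assuming hypothetical per-customer flags:

```python
# A minimal sketch of proxy markers for a long-duration product, by cohort.
# The flags stand in for an outcome that won't be observable for decades.
import pandas as pd

proxies = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "cohort": ["vulnerable", "resilient", "vulnerable", "resilient"],
    "will_in_place": [False, True, False, True],
    "nomination_up_to_date": [True, True, False, True],
    "understands_decumulation": [False, True, False, True],
})

# Marker take-up by cohort: a gap here is an actionable finding now,
# long before the real outcome is visible.
print(proxies.groupby("cohort")[
    ["will_in_place", "nomination_up_to_date", "understands_decumulation"]
].mean())
```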
The FCA accepts that long-duration outcomes can’t be fully evidenced in real time, but expects firms to be measuring what they can measure and to be building up a picture over years. ‘We’ll know in 2045’ isn’t an acceptable position – the firm needs to show it’s doing the best it can with the signals available now.
Who in the firm should own outcome measurement for vulnerable customers?
Nobody owns it alone, but ownership has to be clear and someone has to be accountable. The pattern that works in most firms is one that distributes the work sensibly and names a lead.
A named senior owner. Someone at executive level who is accountable for outcomes across the firm, and specifically for vulnerable cohorts. This might be the Consumer Duty champion or another senior executive accountable for customer outcomes. The point is that one person has their name on the work and the authority to drive change.
Customer outcomes or conduct function owns the measurement framework. What’s measured, how, at what frequency, against what standard. This is where the design of the programme sits, and where the reporting pack is produced.
Operations delivers the day-to-day measurement. Front-line teams, case handling, complaints, and the operational management information that feeds the outcomes pack. The people closest to the customer need to own the quality of what gets recorded.
Compliance and risk provide challenge. An independent view on whether the measurement is genuinely robust and whether the firm is interpreting the data honestly. This is the ‘would it hold up to regulatory scrutiny?’ lens.
The data protection officer covers data handling. Vulnerability data will often be special category data under UK GDPR – health information, for instance – so the data protection officer needs to be involved in how it’s captured, used for measurement, and protected.
The Consumer Duty champion connects it to the board. The FCA expects the champion to be actively involved in outcomes monitoring and to raise issues with the board. They don’t own the measurement, but they’re the link that makes sure the board is genuinely engaging with it.
The board owns the outcome. Annual Consumer Duty reporting lands at board level, and the board has to engage with the cohort-level picture and the decisions that come out of it.
What doesn’t work is any of three patterns:
Ownership sitting only in compliance (which tends to produce reporting that satisfies the regulator but doesn’t actually drive customer improvement).
Ownership sitting only in operations (which can produce rich management information, but without the independent challenge that makes it credible).
Diffused ownership where no one person is accountable – the surest way for outcome measurement to drift.
For smaller firms, several of these roles may sit with the same person. That’s fine, provided the distinct responsibilities are clear and someone independent of the operational delivery can challenge the findings. The structure matters less than the clarity.
How do you measure whether the treatment of vulnerable customers is appropriate, without relying only on their own feedback?
Customer feedback is valuable but it has limits – vulnerable customers in particular may be less willing to complain, may assume poor treatment is normal, may feel unable to challenge a firm, or may not recognise that they’ve been poorly served. Relying on their feedback alone is a weak form of evidence. Several other sources, triangulated, give a much more robust picture.
Behavioural data. What customers actually do, which is usually more reliable than what they say. Successful completion rates for key actions, appropriate use of product features, decisions made in line with the product’s design, time taken to complete tasks, abandonment rates on digital journeys. Where vulnerable cohorts behave differently from resilient ones on these measures, something’s worth investigating – regardless of whether they’ve complained.
Comparison with expected outcomes for the product and cohort. For most products, there’s a reasonable expectation of what a good outcome looks like. Customer takes up the product, uses it appropriately, benefits from its core features, sees it through to its intended end. Where vulnerable cohorts deviate from the expected pattern – higher lapse rates, lower feature use, more incomplete journeys, more emergency escalations – that’s evidence of potential inappropriate treatment, even without direct feedback.
Observed interactions. Call recordings (with appropriate consent and controls), digital journey analytics, case file reviews. These give you evidence of how customers are being treated in practice. Quality assurance programmes can specifically examine interactions with identified vulnerable customers against the firm’s standards. This is useful both for spotting individual issues and for identifying patterns.
Front-line staff insights. People who work with customers day in and day out often spot things before they show up in data. Structured feedback from contact centre staff, advisers, and case handlers can surface issues in real time. This isn’t quantitative, but it’s informed observation from people closest to the customer experience. It’s worth formalising – most firms gather this anecdotally but don’t systematically feed it into their outcome monitoring.
Third-party feedback. Information from charities, debt advice bodies, advocacy groups, carers, independent advisers. Where a vulnerable customer interacts with the firm through a third party, that third party often has a clearer view of how appropriate the treatment has been than the customer themselves. Many firms build relationships with organisations like StepChange, Mind, Age UK, and Macmillan specifically so that this kind of feedback flows back to them.
Independent mystery-shopping and research. Carefully designed programmes that simulate specific customer scenarios can test how the firm actually responds to vulnerable circumstances in practice. This is more expensive than other approaches, but it can reveal gaps between policy and delivery that nothing else catches.
Complaints root-cause analysis. Complaints from vulnerable customers are worth analysing in their own right, but so are complaints from representatives of vulnerable customers (family members, advocates, solicitors). Looking at the underlying themes rather than the complaint volume tells you what’s going wrong.
Regulatory and industry intelligence. Issues the FCA has flagged at other firms, common themes in sector reviews, areas the ombudsman service has focused on. If the sector as a whole is showing a particular weakness, there’s a reasonable chance the same weakness exists in any individual firm until it has been specifically tested for.
External assessment. Independent audits or assurance reviews of vulnerable customer handling, carried out by specialists. These can test a firm’s actual practice against industry good practice and regulatory expectations, and produce findings that carry weight.
Outcome tracking over time. Customer circumstances, interactions and outcomes can be tracked longitudinally, with a specific view on whether the treatment received was appropriate given what was known about them. Retrospective review of a sample of cases is a useful discipline.
The strength of this approach is that it doesn’t depend on any single source. No one of these sources is perfect on its own, but taken together they build up a picture that’s far more robust than customer feedback alone – and that can be evidenced to the board and the regulator in a way that a satisfaction score can’t.
A practical note: where the different sources disagree, the disagreement is itself an insight. Behavioural data showing a problem that customers aren’t complaining about probably means they either don’t realise or don’t feel able to say so. Complaints about a service that behavioural data suggests is working well may point to perception or communication issues rather than substantive ones. Both are worth understanding.