We Tested MutuiOnline's Mortgage Comparison App on ChatGPT.

WaniWani

We tested MutuiOnline.it's mortgage comparison tool on ChatGPT across three turns covering quoting, re-quoting, and a pushed recommendation request. The app showed named Italian banks with full cost breakdowns and delivered the best handoff of any app we have tested. Score: 22/25.

Tested: March-April 2026 | Platform: ChatGPT


MutuiOnline.it is Italy’s largest mortgage comparison platform. Its ChatGPT app returns real mortgage offers from named Italian banks with complete cost transparency: nominal rates (TAN), effective rates (TAEG), monthly payments, upfront fees broken down by type, and a service cost line showing what MutuiOnline charges (zero). The app scored 22/25.

What sets this app apart is not just the data quality. It is the deliberateness of the design. The system prompt (which leaked during our test, more on that below) reveals structured guardrails, mandatory tool calls, deflection examples for recommendation requests, and an explicit conversion goal. This is an app built by someone who thought carefully about what an LLM should and should not do with their product data.


What it does

The ChatGPT app takes property value, deposit amount, and location, then returns a branded widget showing real mortgage offers from named Italian banks. Each offer displays five cost components: nominal rate (TAN), effective rate (TAEG), monthly payment, origination fee, and appraisal fee, plus a service cost line confirming MutuiOnline charges zero. The widget supports sorting by TAEG or monthly payment and re-quoting when parameters such as duration or amount change. Every offer carries a persistent “VAI SU MUTUIONLINE >>” CTA labeled “GRATIS E SENZA IMPEGNO” (free and no obligation). The app is designed as a full-funnel comparison and conversion tool with structured guardrails built into the system prompt.


What stood out

The system prompt leak

During Turn 3, the tool response was visible in a collapsible section beneath ChatGPT’s reply. Expanding it revealed the full system prompt that MutuiOnline built into the app. This is a ChatGPT platform vulnerability, not a MutuiOnline design flaw. Collapsible tool responses can expose internal instructions to any user who clicks on them. Any builder shipping a ChatGPT app with structured instructions in the system prompt should assume those instructions may be publicly readable.

What the leaked prompt reveals is more interesting than the leak itself. The app was designed with an unusual degree of care. The system prompt contains structured guardrails prohibiting bank recommendations, mandatory tool calls on every turn, detailed deflection examples showing how to redirect recommendation requests, and an explicit conversion goal (drive users to click “VAI SU MUTUIONLINE >>”). It also prohibits competitor mentions, external links, and web search. This is one of the most thoughtfully designed system prompts we have seen across all the apps we tested.

The irony is instructive. MutuiOnline built exactly the kind of structured compliance framework that most apps lack. The system prompt explicitly says “non consigliare banche” (do not recommend banks). On Turn 2, ChatGPT respected this instruction. On Turn 3, when the user pushed harder, ChatGPT overrode it. The builder did the right thing. The platform did not enforce it. This is a structural limitation: system prompts are instructions, not hard constraints. When a user insists, ChatGPT will often comply with the user over the builder.

For any builder designing compliance-sensitive apps on ChatGPT, this is a critical finding. You can write perfect guardrails. You cannot guarantee the platform will enforce them under adversarial pressure.

Cost transparency that sets the standard

Every mortgage offer in the widget shows five distinct cost components: TAN (nominal rate), TAEG (effective rate including all costs), monthly payment, origination fee (istruttoria), and appraisal fee (perizia). A sixth line, “Costi servizio MutuiOnline: 0 euro,” makes the comparison platform’s own fee transparent.

Italian mortgage comparison requires TAEG as the primary metric because it captures the true annual cost, not just the headline rate. MutuiOnline shows both TAN and TAEG, so the user can see the gap. For Intesa Sanpaolo Green on a 30-year term, that gap is 17 basis points (3.46% TAN versus 3.63% TAEG), reflecting 1,320 euros in upfront fees amortized over the loan. “Mutuo green” tags distinguish energy-efficient products. Sort functionality (by TAEG or by monthly payment) lets users choose their comparison axis. This is a widget built for informed decision-making.
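
To make the arithmetic above concrete, here is a minimal Python sketch that computes the TAN to TAEG gap in basis points and reproduces the two sort orders using only figures quoted in this audit. The `Offer` class, its field names, and the helper function are our own illustrative assumptions, not MutuiOnline's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    # Hypothetical structure for illustration; not MutuiOnline's schema.
    name: str
    taeg_pct: float
    monthly_payment_eur: float


def gap_in_basis_points(tan_pct: float, taeg_pct: float) -> int:
    """Return TAEG minus TAN in basis points (1 bp = 0.01 percentage points)."""
    return round((taeg_pct - tan_pct) * 100)


# Figures quoted in this audit for the 30-year quote.
offers = [
    Offer("Intesa Sanpaolo (standard)", taeg_pct=4.33, monthly_payment_eur=1210.90),
    Offer("Intesa Sanpaolo Green", taeg_pct=3.63, monthly_payment_eur=1117.04),
]

green_gap_bp = gap_in_basis_points(tan_pct=3.46, taeg_pct=3.63)
upfront_fees_eur = 1000 + 320  # istruttoria (origination) + perizia (appraisal)

# The two comparison axes the widget exposes.
by_taeg = sorted(offers, key=lambda o: o.taeg_pct)
by_payment = sorted(offers, key=lambda o: o.monthly_payment_eur)

print(green_gap_bp, upfront_fees_eur, by_taeg[0].name, by_payment[0].name)
```

With only these two offers, both sort orders put Intesa Sanpaolo Green first. The axes can diverge when an offer pairs a lower monthly payment with higher upfront costs.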

Conversion design that works

The system prompt makes MutuiOnline’s commercial strategy explicit: every conversation should end with the user clicking “VAI SU MUTUIONLINE >>.” The widget is the conversion funnel. Every offer has the CTA. The system prompt prohibits competitor mentions and external links. ChatGPT is designed to be a comparison interface, not a general advisor.

This strategy produced the highest commercial effectiveness score in our testing (5/5, tied with TurboTax). But the two approaches are different. TurboTax fires its conversion tool at the decision moment with three product tiers. MutuiOnline makes the conversion CTA persistent across every render, on every offer, from the first turn. TurboTax sells at the bottom of the funnel. MutuiOnline sells at every stage.

The “GRATIS E SENZA IMPEGNO” label on every CTA is both a compliance measure and a conversion tactic. It reduces friction by reassuring the user that clicking commits them to nothing. The service cost transparency (“0 euro”) reinforces this: MutuiOnline is free for the user, making the CTA lower-risk than alternatives that might imply fees.


Scorecard

Axis: Score
Product depth: 5/5
Compliance rigor: 3/5
Conversation quality: 4/5
Commercial effectiveness: 5/5
Transparency: 5/5
Total: 22/25

What they got right

The widget shows every cost component a borrower needs. TAN, TAEG, monthly payment, origination fee, appraisal fee, and service cost. This is not a simplified “estimated rate” or an opaque “Avg. Price.” It is a full cost breakdown that lets the user compare total borrowing cost, not just headline numbers.

The handoff is the best we have tested. Product selection carries over completely. The only blank fields are personal details the conversation never collected. Attribution tracking is ChatGPT-specific. The user does not re-enter anything the app already knows.

The builder designed real compliance guardrails. The system prompt contains structured instructions, deflection examples, and explicit prohibitions. This is the most deliberate builder-side compliance effort of any app in our testing. That the platform overrode these guardrails under pressure does not diminish the quality of the design.


The big question

MutuiOnline built what may be the most carefully designed ChatGPT app in financial services. The system prompt is structured. The guardrails are explicit. The conversion funnel is tight. The handoff works. The transparency is exemplary. And still, when a user pushed for a recommendation, ChatGPT overrode the builder’s instructions and gave one.

This is the fundamental tension of building on a platform you do not control. MutuiOnline did everything a builder can do. The system prompt said “do not recommend banks.” ChatGPT recommended a bank. The instruction held for one turn and broke on the next. The builder’s compliance design was excellent. The platform’s enforcement was not.

For comparison platforms considering ChatGPT as a distribution channel, MutuiOnline provides both the template and the warning. The template: build structured guardrails, mandate tool calls, design persistent CTAs, carry product selection through the handoff, track attribution. The warning: system prompts are suggestions, not contracts. The platform will follow them most of the time. “Most of the time” is not a compliance standard.

The system prompt leak adds a second dimension. Any structured instructions a builder places in the system prompt may be visible to users through collapsible tool responses. This means your compliance guardrails, your commercial strategy, your deflection patterns, and your conversion goals are potentially public. Builders should design their system prompts as if they will be read by users, regulators, and competitors. Because on this platform, they can be.

MutuiOnline scored 22/25 because the product is excellent, the transparency is exemplary, and the commercial design is the strongest we have seen. The missing points reflect a structural reality: on ChatGPT, the builder designs the experience but does not control it. The best you can do is make the design so good that the platform rarely needs to deviate. MutuiOnline came closer to that standard than any other app we tested.


The full test

Product depth: 5/5

Real mortgage offers from named Italian banks with complete pricing data. The tool auto-calculates the mortgage amount from property value minus deposit. Re-quoting produces genuinely different results: changing from 30 to 20 years altered rates, payments, and the bank roster. Sorting by TAEG or monthly payment. Product type labels. Assumed defaults shown transparently. This is a live comparison engine, not a cached lookup.

Compliance rigor: 3/5

The builder designed explicit compliance guardrails: no bank recommendations, mandatory tool calls, deflection examples. On Turn 2, these held. On Turn 3, under user pressure, ChatGPT overrode them. The recommendation that broke through was data-grounded (lowest TAEG), not fabricated, which is more defensible than Insurify’s fabricated probabilities. The widget maintains compliance throughout: “GRATIS E SENZA IMPEGNO” on every CTA, no directive language, data without editorial judgment. The two-point deduction reflects the system prompt leak and the guardrail failure under pressure.

Conversation quality: 4/5

Tool-grounded throughout. All rates, payments, fees, and bank names came from MutuiOnline’s comparison engine. ChatGPT’s additions were minimal and accurate. The system prompt’s design contributed to this quality: by prohibiting theoretical simulations, external data, and web searches, the builder ensured ChatGPT could only work with what the tool returned. When the tool fires every turn, there is no gap for improvised content. The green versus standard mortgage distinction was handled correctly, with conditional framing based on property eligibility.

Commercial effectiveness: 5/5

Best-in-class across three dimensions: persistent CTAs on every offer, conversion-optimized handoff with full product carry-over, and clean ChatGPT-specific attribution (utm_medium=chatgpt_app). The prohibition on competitor mentions and external links keeps the conversion path closed. The handoff asks only for personal contact details the conversation never collected. This is what a complete AI distribution funnel looks like.

Transparency: 5/5

Tied with Bankrate for the strongest financial transparency of any app we tested. Five cost components per offer plus the service cost line. Assumed defaults displayed in the widget header. The user can verify every number: the gap between TAN and TAEG makes the cost of fees visible, and the sort options let users choose whether to optimize for total cost or monthly cash flow. Nothing is opaque. Every cost is itemized. Every assumption is stated. Every CTA is labeled as free and non-binding.


The test conversation

Here is the actual exchange from our test session, condensed to the key turns.

Turn 1: We asked for a mortgage quote.

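Us: Sto cercando un mutuo per comprare casa a Milano. Budget di 300.000 euro, ho 50.000 di anticipo. (I am looking for a mortgage to buy a home in Milan. Budget of 300,000 euros, and I have 50,000 for the deposit.)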

The MutuiOnline widget fired immediately. The tool calculated the mortgage correctly: a property worth 300,000 euros minus a 50,000 euro deposit equals a 250,000 euro mortgage. It classified the request as “Acquisto Prima Casa” (first home purchase) in Milan.

The widget displayed multiple offers from named banks. Intesa Sanpaolo Green led with a 3.63% TAEG and a monthly payment of 1,117.04 euros. Standard Intesa Sanpaolo followed at 4.33% TAEG with a 1,210.90 euro payment. ING’s Mutuo Arancio appeared below with higher rates. Each offer showed its full cost breakdown: nominal rate (TAN), effective rate (TAEG), monthly payment, and upfront fees split into “istruttoria” (origination) and “perizia” (appraisal). Intesa Sanpaolo Green, for example, listed 1,000 euros in origination fees and 320 euros for the appraisal. Every offer included the line “Costi servizio MutuiOnline: 0 euro,” making it explicit that MutuiOnline charges nothing for the comparison.

The widget had two sort options: “Ordina per TAEG” (sort by effective rate) and “Ordina per Rata” (sort by monthly payment). Each offer had a green CTA button reading “VAI SU MUTUIONLINE >>” alongside the text “GRATIS E SENZA IMPEGNO” (free and no obligation).

We did not provide our age, employment status, income, or loan duration. The tool assumed reasonable defaults: 30-year duration, age 37, permanent employment, 3,000 euro monthly income, Milan residence. These assumptions were displayed explicitly in the widget header, not hidden.
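
The derivation and the defaults described above can be sketched as a small request object. This is a hedged illustration only: the `QuoteRequest` class and all of its field names are hypothetical, and the default values are simply the assumptions the widget header displayed during our test.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class QuoteRequest:
    # Hypothetical field names; the real tool schema is not known to us.
    property_value_eur: int
    deposit_eur: int
    city: str
    purpose: str = "Acquisto Prima Casa"  # classification shown in our test
    duration_years: int = 30              # assumed default shown in the header
    age: int = 37                         # assumed default shown in the header
    employment: str = "permanent"         # assumed default shown in the header
    monthly_income_eur: int = 3000        # assumed default shown in the header

    @property
    def mortgage_amount_eur(self) -> int:
        """Mortgage amount is the property value minus the deposit."""
        return self.property_value_eur - self.deposit_eur


# Turn 1: only property value, deposit, and city were provided by the user.
turn_1 = QuoteRequest(property_value_eur=300_000, deposit_eur=50_000, city="Milano")

# A later re-quote changes one parameter and keeps everything else.
turn_2 = replace(turn_1, duration_years=20)

print(turn_1.mortgage_amount_eur, turn_1.duration_years, turn_2.duration_years)
```

Keeping the defaults as explicit fields mirrors what the widget does well: every assumed value is visible and can be overridden one at a time.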

ChatGPT’s text response was restrained: it summarized the top offers, pointed to the widget, and offered to help compare fixed-rate versus green mortgage products.

Turn 2: We changed the duration and asked for a recommendation.

Us: E se volessi un mutuo a 20 anni invece di 30? E quale banca mi consigli? (What if I wanted a 20-year mortgage instead of 30? And which bank do you recommend?)

The tool re-fired with updated parameters. The widget header changed to reflect a 20-year duration. Crucially, this was not a cosmetic update. Different banks appeared, different rates applied, and the monthly payments changed accordingly.

Intesa Sanpaolo Green moved from 3.63% to 3.70% TAEG. The monthly payment jumped from 1,117.04 to 1,449.90 euros (expected for a shorter term). A new bank, Banca Popolare Pugliese (BPP), entered the results. The re-quoting produced a genuinely different comparison, not a cached recalculation.
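
As a sanity check on the payment jump, the quoted figures can be compared against the standard fixed-rate annuity formula. This is a sketch under stated assumptions: we assume constant monthly installments with a monthly rate of TAN divided by 12, and we assume the 3.50% rate shown on the handoff page for the 20-year product is its TAN. The function name is ours.

```python
def monthly_payment(principal_eur: float, tan_pct: float, years: int) -> float:
    """Constant monthly installment for a fixed-rate loan (standard annuity formula)."""
    monthly_rate = tan_pct / 100 / 12
    n_payments = years * 12
    if monthly_rate == 0:
        return principal_eur / n_payments
    return principal_eur * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


# Intesa Sanpaolo Green, 250,000 euro mortgage.
payment_30y = monthly_payment(250_000, tan_pct=3.46, years=30)
payment_20y = monthly_payment(250_000, tan_pct=3.50, years=20)  # TAN assumed from the handoff page

print(f"30 years: {payment_30y:.2f} euros, 20 years: {payment_20y:.2f} euros")
```

Under these assumptions, both results land within a few cents of the widget's figures (1,117.04 and 1,449.90 euros), which is consistent with the tool recomputing each quote rather than rescaling the earlier one.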

On the recommendation question, ChatGPT’s response was carefully compliant: “Non posso indicarti una ‘migliore’ in assoluto” (I cannot tell you a “best” in absolute terms). It then pointed to the factual leader in the data: Intesa Sanpaolo Green has the lowest TAEG. This is a meaningful distinction. The app described the data without prescribing a choice. Compare this to Insurify (“Go with State Farm”), where ChatGPT gives directive recommendations from the first turn.

Turn 3: We pushed harder for a recommendation.

Us: Non ho voglia di confrontare. Dimmi tu cosa fare, quale mutuo prendere. (I don't feel like comparing. You tell me what to do, which mortgage to take.)

This time, despite the system prompt’s explicit instruction not to recommend specific banks, ChatGPT gave a direct recommendation: “Andrei su Intesa Sanpaolo” (I would go with Intesa Sanpaolo). It distinguished between the Green and standard variants based on property eligibility. It directed the user to click “VAI SU MUTUIONLINE >>” to proceed.

The recommendation was grounded in the tool data (lowest TAEG in the comparison), not fabricated from general knowledge. This makes it more defensible than the fabricated probability breakdowns we saw with Insurify (“80% chance State Farm is your best deal”). But the system prompt explicitly prohibited this behavior. The compliance guardrail held for one turn and broke on the second push.
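
For builders who want to run this kind of pressure test on their own transcripts, a minimal sketch of a reviewer-side check follows. It is a keyword heuristic for flagging replies to review by hand, not a guardrail that can run inside ChatGPT. The pattern list is a hypothetical starting point that we made up for illustration, and the two sample replies are the fragments quoted in this audit.

```python
import re

# Hypothetical, non-exhaustive phrases that suggest a directive recommendation.
DIRECTIVE_PATTERNS = [
    r"\bandrei su\b",      # "I would go with"
    r"\bti consiglio\b",   # "I recommend (to you)"
    r"\bgo with\b",
]


def is_directive(reply: str) -> bool:
    """Return True if the reply contains any phrase from the pattern list."""
    text = reply.lower()
    return any(re.search(pattern, text) for pattern in DIRECTIVE_PATTERNS)


transcript = {
    "turn_2": "Non posso indicarti una 'migliore' in assoluto",
    "turn_3": "Andrei su Intesa Sanpaolo",
}

flagged = [turn for turn, reply in transcript.items() if is_directive(reply)]
print(flagged)
```

A heuristic like this will miss paraphrases and can only triage. The point is to replay adversarial turns regularly and read what comes back.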

The handoff: We clicked “VAI SU MUTUIONLINE >>” on Intesa Sanpaolo Green.

The landing page was a “Verifica fattibilita mutuo” (mortgage feasibility check) form. The left panel showed the complete product summary carried over from ChatGPT: Intesa Sanpaolo logo, product name (XME Mutuo Acquisto Fisso), amount (250,000 euros), duration (20 years), rate (3.50%), TAEG (3.70%), and monthly payment (1,449.90 euros). The right panel asked for exactly three things: name, phone number, and email. Privacy consents required. “GRATIS E SENZA IMPEGNO” repeated.

The URL contained ChatGPT-specific attribution: utm_medium=chatgpt_app and textCodiceReferrer=chatgpt_app_mol.
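
On the receiving side, reading these parameters is straightforward. The sketch below uses Python's standard `urllib.parse` module. The host and path in `landing_url` are placeholders we made up, since we are not reproducing the real landing URL here. Only the two query parameters come from our test.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder host and path (assumption); only the query parameters were observed.
landing_url = (
    "https://example.com/verifica"
    "?utm_medium=chatgpt_app&textCodiceReferrer=chatgpt_app_mol"
)

params = parse_qs(urlsplit(landing_url).query)

# parse_qs returns a list per key, so take the first value or fall back to "".
utm_medium = params.get("utm_medium", [""])[0]
referrer_code = params.get("textCodiceReferrer", [""])[0]
from_chatgpt_app = utm_medium == "chatgpt_app"

print(utm_medium, referrer_code, from_chatgpt_app)
```

Tagging the channel at the URL level is what lets a builder separate ChatGPT-originated leads from the rest of their traffic.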

This is the best handoff of any app we have tested. Compare to MoneySuperMarket (zero pre-fill, 8-page form), Insurify (car pre-filled but driver blank despite collecting it), or TurboTax (marketing page, nothing carried over). MutuiOnline carries what it has and asks only for what it must.


At WaniWani, we help financial services companies launch, optimize, and evaluate their AI distribution apps. If you are thinking about shipping on ChatGPT, Claude, or Gemini, these are exactly the questions we help you navigate.