We Tested Suri.CZ's Insurance Comparison App on ChatGPT.

We tested Suri.CZ’s Czech insurance comparison tool on ChatGPT across 3 turns covering quoting, recommendation, and binding status. Suri.CZ built three layers of disclaimers and a widget that guides users through tier categories without ChatGPT needing to editorialize. Score: 22/25.

Tested: April 2026 | Platform: ChatGPT

Suri.CZ is a Czech independent insurance comparator. Its ChatGPT app returns 16 quotes from named carriers, organized into three widget tiers, wrapped in a three-layer disclaimer framework.

What it does

Suri.CZ is an independent insurance comparison platform serving the Czech market. Its ChatGPT app lets users describe their car, location, and age, then surfaces a branded widget showing carriers ranked by price with coverage limits, quality labels, and “Zvolit” (Choose) buttons that link to Suri.CZ’s website. The widget supports two input paths (registration lookup or manual entry) and returns 16 quotes from named Czech insurers. It is designed as a comparison interface with built-in tier categorization that guides users without relying on ChatGPT to rank or recommend.

What stood out

Suri.CZ has three layers of disclaimers, each addressing a different failure mode.

The first layer is the pre-app disclaimer. Before any interaction, the user is told that content is generated by ChatGPT, that Suri.CZ controls the context but not the response, and that correctness is not guaranteed. This addresses the invisible boundary problem: users often do not know where the tool ends and ChatGPT begins. Suri.CZ tells them upfront.

The second layer is the widget disclaimer. Every time quotes render, the bottom of the widget reads: “Prices are indicative; the final price may differ based on your specific data and the insurer’s assessment.” This is builder-controlled text. ChatGPT cannot modify it. It appears every single time the widget fires.

The third layer is conversational. When asked whether the price is binding, ChatGPT answered with a clear distinction: “An indicative calculation, not a final binding offer.” It listed five specific factors that could change the price. A clear estimate-versus-quote distinction, delivered inside the conversation.

These three layers work together. The pre-app disclaimer sets expectations. The widget disclaimer qualifies every price render. The conversational layer handles follow-up questions about binding status. Together, they form a compliance framework that accounts for the reality of conversational AI: users will ask follow-up questions, ChatGPT will answer them, and the builder needs guardrails at every level.

The widget tier design is the other standout. Instead of presenting a flat list and leaving ChatGPT to rank carriers, Suri.CZ pre-categorizes results into three tiers. Each tier has a coverage quality label and a highlighted position. ChatGPT does not need to editorialize because the widget already provides the decision framework.

The recommendation itself deserves scrutiny. When ChatGPT names a specific carrier and price, the question is whether the user can verify it. With a tier-structured widget, every number ChatGPT references traces back to visible widget data. The user does not need to trust ChatGPT’s judgment. They can verify it against the same screen. That is a structural difference from apps where ChatGPT produces rankings from raw comparison data with nothing to check them against.

Scorecard

Axis	Score
Product depth	4/5
Compliance rigor	5/5
Conversation quality	4/5
Commercial effectiveness	4/5
Transparency	5/5
Total	22/25

What they got right

Three layers of disclaimers, each addressing a different failure mode. Pre-app: “content is generated by ChatGPT, correctness not guaranteed.” Widget: “prices are indicative, final price may differ.” Conversational: “an indicative calculation, not a final binding offer.” Three distinct layers that work together.

The widget controls the narrative, not ChatGPT. The three-tier design (“Cheap but good,” “Golden middle,” “No compromises”) provides the decision framework inside the widget itself. ChatGPT does not need to rank carriers because the widget already has. This is a structural solution: when you give an LLM comparison data without a framework, the LLM will create its own rankings. Suri.CZ built the framework into the widget.

Data-grounded recommendations with verifiable math. When ChatGPT recommended UNIQA 100/100, every number in the reasoning (118 Kc difference, 60/60 versus 100/100 limits) traced back to visible widget data. The user does not need to trust ChatGPT’s judgment. They can verify it against the widget in the same conversation.

The big question

Suri.CZ solved the disclaimer problem more thoroughly than any other app in our testing. But the disclaimer framework raises its own question: what happens when the user ignores it?

The pre-app disclaimer requires the user to click “I understand and continue.” This is consent theater, familiar from cookie banners and terms of service. Research on click-through agreements consistently shows that users click without reading. The disclaimer is legally present. Whether it is functionally present, meaning whether it actually changes user behavior or expectations, is a different question.

Suri.CZ’s three-layer approach is the best compliance framework we have tested. But compliance infrastructure and user comprehension are different things. “Prices are indicative” appears at the bottom of a widget that shows specific prices next to specific carrier names. The visual weight of “UNIQA, 3,074 Kc/year” far exceeds the weight of a footnote saying the price might change. Whether users internalize any of the three disclaimers before clicking “Choose” and entering Suri.CZ’s purchase funnel is the question that no disclaimer framework, however well-designed, can fully answer.

The deeper question is regulatory. Czech insurance distribution is governed by the national transposition of the EU Insurance Distribution Directive (IDD), which requires information given to customers to be fair, clear, and not misleading. Suri.CZ’s framework is arguably the strongest attempt at meeting this standard in a ChatGPT context. But the IDD was written for human intermediaries and static websites, not for AI-generated conversational interfaces where the builder controls the data but not the narrative. Whether three layers of disclaimers constitute compliance under the IDD when a third-party AI controls the conversation is a question regulators have not yet answered.

The full test

Product depth: 4/5

The comparison widget is both broad and structured. It returns sixteen quotes from named Czech insurers, with logos, annual prices, coverage limits (60/60, 100/100, 150/150 million Kc), and coverage quality labels: “Minimalni kryti” (Minimal coverage), “Dobre kryti” (Good coverage), and “Nejvyssi kryti” (Highest coverage). The three-tier categorization (“Cheap but good,” “Golden middle,” “No compromises”) is a builder design choice that adds genuine decision-making value.

Two input paths are available: registration number lookup (fast) or manual entry (brand, model, year, power, postcode). That flexibility matters when users do not have the registration plate handy.

Where it falls short of a 5 is coverage customization. There is no way to adjust deductibles, add-ons, or specific coverage components through the chat interface. The widget presents what Suri.CZ’s calculator determines based on the inputs. For a mandatory vehicle liability product in the Czech market, this is less of a limitation than it would be for more complex product lines. But the user cannot explore “what if I increase my deductible” within the conversation.

Compliance rigor: 5/5

Three layers of disclaimers, each addressing a different failure mode.

The pre-app disclaimer tells the user that ChatGPT generates the content and that correctness is not guaranteed. This is the only app we have tested that discloses the AI generation boundary before the conversation begins. Every other app lets the user discover this (or not) through the interaction itself.

The widget disclaimer qualifies every price as indicative and directs the user to Suri.CZ for exact calculations. This is builder-controlled, meaning ChatGPT cannot modify or omit it.

The conversational disclaimer, triggered by a direct question about binding status, produced the clearest estimate-versus-quote distinction in our testing: “An indicative calculation, not a final binding offer,” with five specific factors listed.

When ChatGPT recommended UNIQA 100/100, the recommendation was data-grounded (118 Kc difference, verifiable in the widget) and framed as a value trade-off, not a directive. Compare this to Insurify’s directive (“Go with State Farm. Don’t over-optimize.”). Suri.CZ’s recommendation references specific data the user can verify.

Conversation quality: 4/5

The conversation is clean and efficient. Turn 1 gathered data with two input paths and no filler. Turn 2 delivered 16 quotes with measured commentary. Turn 3 handled a recommendation request with data-grounded reasoning and a binding-status clarification.

ChatGPT’s advice on what to look for (higher coverage limits at 100/100+, roadside assistance, direct claims handling) is relevant and accurate for Czech mandatory vehicle insurance. The recommendation trade-off (118 Kc more per year for the step up from 60/60 to 100/100 limits) is genuinely useful.

Where it falls short of a 5 is the limited multi-turn depth. Three turns covered the full journey from input to recommendation to binding clarification. There was no re-quoting (changing parameters to see price impact), no follow-up on specific coverage features, and no competitive comparison. For a mandatory vehicle liability product, three turns may be sufficient. But the conversation did not stretch the tool’s capabilities beyond a single quote-and-recommend cycle.

Commercial effectiveness: 4/5

The widget is Suri.CZ-branded throughout with “Zvolit” (Choose) CTAs on every tier. Attribution tracking is ChatGPT-specific (utm_campaign=chatgptapp) with a link ID for session tracking.
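For readers unfamiliar with how this kind of attribution is read on the receiving side, here is a minimal sketch in Python. Only the utm_campaign=chatgptapp value comes from our test; the host, path, and the link_id parameter name and value are placeholders we made up, since we did not record the exact parameter Suri.CZ uses for the link ID.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical landing URL. Only utm_campaign=chatgptapp was observed in our
# test; the host, path, and link_id parameter are illustrative placeholders.
landing_url = "https://example.com/step-2?utm_campaign=chatgptapp&link_id=PLACEHOLDER"


def attribution(url: str) -> dict:
    """Return the campaign and link ID carried in a landing URL's query string."""
    params = parse_qs(urlparse(url).query)
    return {
        "campaign": params.get("utm_campaign", [None])[0],
        "link_id": params.get("link_id", [None])[0],  # assumed parameter name
    }


info = attribution(landing_url)
print(info)
```

A dedicated campaign value lets the builder separate ChatGPT-originated sessions from other traffic in ordinary analytics tooling, while a per-link ID can tie a landing back to a specific widget render.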

Handoff quality is strong. The car details carried over from the conversation (step 1 completed automatically), and age was pre-filled on step 2. The user skipped the car entry step entirely. Compare this to Insurify (car pre-filled but driver blank) and MutuiOnline (product selection carried, personal data blank). Suri.CZ’s handoff is among the best we have tested.

Where it loses a point is the tier design’s potential downside. The “Cheap but good” tier positions UNIQA 100/100 as the highlighted value option. If a user clicks through on a different carrier (say, Generali at 7,406 Kc), the widget has already anchored the user’s price expectations at 3,074 Kc. The tier labels may create friction for higher-priced options, even when those options offer genuinely superior coverage. This is a commercial design trade-off, not a flaw, but it means the widget’s own framing may limit conversions on premium products.

Transparency: 5/5

Every price is labeled with annual cost and coverage limits. The widget shows exactly what you are comparing: carrier name, logo, price, coverage limits in millions of Kc, and a coverage quality label. The user can verify ChatGPT’s recommendation against the widget data directly.

The three-layer disclaimer framework means the user knows: (1) content is ChatGPT-generated, not Suri.CZ-verified, (2) prices are indicative, not final, and (3) specific factors can change the final price. The pre-app disclaimer is the single most transparent piece of disclosure we have seen in any ChatGPT insurance app. It sets the expectation that the user is talking to an AI, not an insurance advisor, before the first turn.

The recommendation on Turn 3 is fully traceable. “118 Kc more for double the coverage” maps directly to widget data: 2,956 Kc (60/60) versus 3,074 Kc (100/100). No fabricated statistics, no unsourced claims, no probability distributions. The user can check every number.
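The check a reader can run against the widget amounts to two lines of arithmetic. A minimal sketch in Python, using only the figures quoted above; the dictionary layout and field names are ours for illustration, not Suri.CZ’s data format.

```python
# Figures as shown in the widget during our test: annual premium in Kc,
# liability limit in millions of Kc. Field names are illustrative only.
cheapest = {"carrier": "UNIQA", "limit_m_kc": 60, "price_kc": 2956}
recommended = {"carrier": "UNIQA", "limit_m_kc": 100, "price_kc": 3074}

# Extra annual cost of the recommended option over the cheapest one.
price_gap = recommended["price_kc"] - cheapest["price_kc"]

# How much higher the recommended coverage limit is, as a ratio.
limit_ratio = recommended["limit_m_kc"] / cheapest["limit_m_kc"]

print(f"Price difference: {price_gap} Kc/year")
print(f"Limit ratio: {limit_ratio:.2f}x")
```

The price gap comes out at exactly 118 Kc. The limit ratio is about 1.67, so “double” is a generous rounding of the step from 60/60 to 100/100. Both inputs are visible in the widget, which is what makes the claim checkable in the first place.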

The test conversation

Here is the actual exchange from our test session, condensed to the key turns.

Before the conversation started: a pre-app disclaimer.

Before we typed a single word, Suri.CZ displayed a branded disclaimer screen. In Czech, it reads: “Obsah je generovan sluzbou ChatGPT. Aplikace SURI.CZ muze urcovat kontext zadani, ale nemuze primo ovlivnit konkretni podobu vygenerovane odpovedi ani garantovat jeji spravnost ci uplnost.” Translated: “Content is generated by ChatGPT. The SURI.CZ app can determine the context of the input but cannot directly influence the specific form of the generated response nor guarantee its correctness or completeness.”

No other app we have tested does this. The user knows, before the conversation begins, that they are talking to ChatGPT and not to Suri.CZ directly.

Turn 1: We asked for car insurance in Prague.

Us: Potrebuji pojisteni auta v Praze. (I need car insurance in Prague.)

The tool did not fire. ChatGPT asked for required inputs: registration number (SPZ), postcode (PSC), and age. It also offered an alternative path for users without a registration number: brand, model, year of manufacture, year of first registration, power in kW, and postcode. Two paths to the same data, no filler.
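The two-path behavior can be pictured as a simple completeness check on the inputs. The sketch below is our own illustration in Python, not Suri.CZ’s implementation: the field names are hypothetical, and we assume postcode and age are needed on both paths, with the manual vehicle fields standing in for the registration number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuoteInput:
    # Hypothetical field names; the app's real tool schema was not visible to us.
    postcode: Optional[str] = None               # PSC
    age: Optional[int] = None
    registration: Optional[str] = None           # SPZ
    brand: Optional[str] = None
    model: Optional[str] = None
    year_made: Optional[int] = None
    year_first_registered: Optional[int] = None
    power_kw: Optional[int] = None


def input_path(q: QuoteInput) -> Optional[str]:
    """Return which input path is complete, or None if more data is needed."""
    # Assumption: postcode and age are required whichever path is used.
    if q.postcode is None or q.age is None:
        return None
    if q.registration is not None:
        return "registration"
    manual = (q.brand, q.model, q.year_made, q.year_first_registered, q.power_kw)
    if all(value is not None for value in manual):
        return "manual"
    return None


# Turn 1: nothing but a request for insurance, so no path is complete.
turn_1 = QuoteInput()

# Turn 2: the manual details we supplied in the test.
turn_2 = QuoteInput(postcode="11000", age=35, brand="Skoda", model="Octavia",
                    year_made=2020, year_first_registered=2020, power_kw=110)

print(input_path(turn_1), input_path(turn_2))
```

Under these assumptions, the first turn yields no complete path, which matches ChatGPT asking for inputs instead of calling the tool, and the second turn satisfies the manual path.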

Turn 2: We provided the car details.

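Us: Skoda Octavia, rok vyroby 2020, prvni registrace 2020, 110 kW, PSC 11000, vek 35. (Skoda Octavia, year of manufacture 2020, first registration 2020, 110 kW, postcode 11000, age 35.)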

The widget fired with 16 quotes from named carriers.

The widget organized carriers into three highlighted tiers: “Levne ale dobre” (Cheap but good) featured UNIQA at 3,074 Kc/year with 100/100 million Kc coverage limits. “Zlata stredni cesta” (Golden middle way) showed CSOB Pojistovna at 8,131 Kc/year. “Bez kompromisu” (No compromises) highlighted Generali Ceska Pojistovna at 7,406 Kc/year with the highest coverage at 150/150 million Kc. Below the highlighted tiers, additional cards scrolled right with more options, including UNIQA’s minimum coverage at 2,956 Kc and Direct at 3,808 Kc.

At the bottom of the widget, a disclaimer: “Uvedene ceny jsou orientacni; finalni cena se muze lisit na zaklade vasich konkretnich udaju a posouzeni pojistovny. Presnou kalkulaci ziskate na SURI.CZ.” (Prices are indicative; the final price may differ based on your specific data and the insurer’s assessment. Get an exact calculation at SURI.CZ.)

ChatGPT’s commentary was measured. It listed the cheapest option (UNIQA 60/60 at 2,956 Kc/year), noted what to look for (higher coverage limits, roadside assistance, direct claims handling), and suggested a “reasonable compromise”: UNIQA 100/100 at 3,074 Kc/year. Mild guidance grounded in the widget data.

Turn 3: We asked for a recommendation and whether the price is binding.

Us: Kterou pojistovnu mi doporucite? A je ta cena zavazna? (Which insurer would you recommend? And is the price binding?)

ChatGPT gave a direct recommendation: “Doporucil bych vam UNIQA 100/100 mil. Kc za 3,074 Kc rocne.” (I would recommend UNIQA 100/100 million Kc for 3,074 Kc per year.) The reasoning was specific and verifiable: only 118 Kc more than the cheapest option, but with double the coverage limits.

On whether the price is binding: “Ne uplne. Orientacni kalkulace, ne finalni zavazna nabidka.” (Not exactly. An indicative calculation, not a final binding offer.) It listed what could change the price: registration/VIN details, bonus history, the insurer’s assessment, start date, and add-ons. Then the summary: “Berte to jako dobry odhad, ale ne 100% zavaznou cenu.” (Take it as a good estimate, not a 100% binding price.)

The handoff: We clicked “Zvolit” (Choose) on UNIQA 100/100.

The landing page was Suri.CZ’s website with a 4-step flow: Vuz (Car), Udaje (Data), Pojisteni (Insurance), Sjednani (Conclusion). The user landed on step 2. Step 1 (car details) was already completed, meaning the Skoda Octavia data carried over from the conversation. Age (35) was pre-filled on step 2. The URL included utm_campaign=chatgptapp and a link ID for session tracking.

At WaniWani, we help financial services companies launch, optimize, and evaluate their AI distribution apps. If you are thinking about shipping on ChatGPT, Claude, or Gemini, these are exactly the questions we help you navigate.