Scoring Methodology
Every shared review and comparison on AInalyzer is more than a summary of what other sites already wrote. Each shared page renders proprietary, per-product data — an overall score, a four-dimension Score Breakdown, a Source Consensus chart, and an aggregated community vote — assembled by our own scoring framework. This methodology page explains how each number is produced so you can judge how much weight to give it.
The 1-10 AInalyzer Score
The overall AInalyzer Score sits on a 1-10 scale and is anchored to expert consensus across the source set, not to brand popularity or marketing claims. A high score requires alignment across multiple independent reviewers; a single negative outlier cannot push the score down by more than 0.5 points.
- 9-10 — Highly recommended. Excels in its category with minimal trade-offs and broad reviewer agreement.
- 8 — Recommended. Strong product with clear strengths; minor caveats noted across sources.
- 6-7 — Worth considering. Solid in some dimensions, weak in others; right for the right buyer.
- 3-5 — Not recommended. Significant issues outweigh the strengths for most use cases.
- 1-2 — Avoid. Recurring problems across the source set; better options exist at every price point.
Lifecycle matters: a phone with three-year-old hardware is judged against today's alternatives, not against the year it launched. Value matters too — a budget product that outperforms its price tier can score above a premium product that doesn't justify its cost.
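As a rough illustration of how the bands and the outlier cap fit together, here is a minimal TypeScript sketch. The function names, the cut-offs used for fractional scores, and the way the cap is applied are assumptions for illustration, not the production scoring code.

```typescript
// Illustrative only: band labels follow the scale above; how the pipeline
// actually aggregates the source set before the cap is not shown here.
type Verdict =
  | "Highly recommended"
  | "Recommended"
  | "Worth considering"
  | "Not recommended"
  | "Avoid";

function verdictForScore(score: number): Verdict {
  if (score >= 9) return "Highly recommended";
  if (score >= 8) return "Recommended";
  if (score >= 6) return "Worth considering";
  if (score >= 3) return "Not recommended";
  return "Avoid";
}

// Cap from the scale above: a single negative outlier can pull the
// consensus score down by at most 0.5 points (the aggregation itself is assumed).
function capOutlierImpact(scoreWithoutOutlier: number, scoreWithOutlier: number): number {
  return Math.max(scoreWithOutlier, scoreWithoutOutlier - 0.5);
}
```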
The four sub-dimensions
Every shared review page renders a Score Breakdown widget with four sub-scores derived from the source analysis. The sub-scores are constrained to be consistent with the overall score — a product rated 5/10 overall cannot have a 9 on every dimension.
Performance
How well the product does its primary job in the conditions that buyers actually use it. Anchored to benchmark data when sources cite specific numbers, otherwise to expert consensus.
Value
Specs-per-dollar relative to alternatives at the same price point. A product can score high on performance and still score low on value if a cheaper option delivers most of the benefit.
Reliability
Frequency and severity of recurring complaints across the source set — build issues, software bugs, durability concerns, support problems. A product with a small number of very loud failure modes can score lower here than its headline performance suggests.
Hype vs reality
Whether what reviewers found in real use matches what the marketing claimed. Products that overdeliver on understated promises score above 7; products with marketing-page features reviewers couldn't reproduce score below 4.
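To make the consistency constraint described at the top of this section concrete, here is a hedged sketch of the breakdown data and one way a consistency check could be enforced. The field names and the 1.5-point tolerance are assumptions, not the published rule.

```typescript
// Illustrative shape of the Score Breakdown; field names are assumptions.
interface ScoreBreakdown {
  performance: number;   // 1-10
  value: number;         // 1-10
  reliability: number;   // 1-10
  hypeVsReality: number; // 1-10
}

// Hypothetical check: the average sub-score must stay close to the overall
// score, so a 5/10 product cannot carry a 9 on every dimension.
function isConsistent(overall: number, breakdown: ScoreBreakdown, tolerance = 1.5): boolean {
  const subs = [
    breakdown.performance,
    breakdown.value,
    breakdown.reliability,
    breakdown.hypeVsReality,
  ];
  const mean = subs.reduce((sum, s) => sum + s, 0) / subs.length;
  return Math.abs(mean - overall) <= tolerance;
}
```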
Source Consensus widget
Below the Score Breakdown we render a Source Consensus widget — a per-theme view of what reviewers actually concluded. Each row names a buyer-relevant theme (battery life, low-light camera, build quality, etc.), tags the prevailing sentiment as positive / mixed / negative, and assigns an agreement percentage reflecting how aligned the source set is on that point.
A 90% positive rating on battery life means almost every reviewer praised it. A 55% mixed rating means the source set is genuinely divided — some reviewers praise the trade-off, others flag it. The widget surfaces disagreement instead of hiding it inside an averaged score.
The themes themselves are picked per product category, not from a fixed checklist. A laptop's consensus rows include keyboard feel and thermals; a TV's rows include HDR handling and motion smoothing. This per-page specificity is the point: it's data tailored to what buyers of this product actually want to know.
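For readers who want the shape of this data, a consensus row can be pictured roughly as follows; the field names and the example values are invented for illustration.

```typescript
// Illustrative shape of a Source Consensus row; names and values are assumptions.
type Sentiment = "positive" | "mixed" | "negative";

interface ConsensusRow {
  theme: string;        // picked per product category, e.g. "battery life"
  sentiment: Sentiment; // prevailing view across the source set
  agreementPct: number; // 0-100: how aligned the sources are on this theme
}

// Example rows for a phone (values invented):
const exampleRows: ConsensusRow[] = [
  { theme: "battery life", sentiment: "positive", agreementPct: 90 },
  { theme: "low-light camera", sentiment: "mixed", agreementPct: 55 },
];
```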
Source weighting
Not every reviewer carries the same weight in the analysis. Our pipeline ranks source domains in three tiers:
- Tier 1 — Specialists with measured data. GSMArena, RTINGS, Notebookcheck, DxOMark, Geekbench. Their numerical benchmarks anchor performance sub-scores.
- Tier 2 — Expert generalists. Tom's Guide, TechRadar, PCMag, The Verge, CNET, LaptopMag, DigitalTrends, SoundGuys, WhatHiFi. Their narrative reviews drive the verdict and trade-off framing.
- Tier 3 — User signals. Reddit threads, Amazon reviews, owner forums. These surface long-tail issues professional reviews miss — a recurring driver problem six months after launch, a regional warranty quirk.
Each shared page lists the domains it actually drew from, immediately under the AInalyzer Score badge, so you can see which mix produced this specific result.
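As a sketch of how the tiers above could translate into weights, the lookup below uses a subset of the named domains; the numeric weights are assumed values, not the production weighting.

```typescript
// Illustrative tier lookup. Domains are a subset of the tiers listed above;
// the weights 1.0 / 0.7 / 0.4 are assumptions for illustration.
const TIER_1 = ["gsmarena.com", "rtings.com", "notebookcheck.net", "dxomark.com", "geekbench.com"];
const TIER_2 = ["tomsguide.com", "techradar.com", "pcmag.com", "theverge.com", "cnet.com"];

function sourceWeight(domain: string): number {
  if (TIER_1.includes(domain)) return 1.0; // measured data anchors performance sub-scores
  if (TIER_2.includes(domain)) return 0.7; // expert narrative drives the verdict
  return 0.4;                              // user signals surface long-tail issues
}
```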
How the page is assembled
When a user requests a review, AInalyzer runs a five-angle grounded search using Google Search Grounding integrated with Gemini 2.5 Flash. The angles cover expert reviews, recurring problems, specs, pricing, and release context — all in a single grounded call. The model then synthesises the source set into the structured fields you see on the shared page: pros, cons, summary, the score, the four-dimension breakdown, and the consensus rows.
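As an illustration only (not AInalyzer's pipeline code), a single grounded call with the Google Gen AI TypeScript SDK looks roughly like the sketch below. The prompt wording and the helper name are invented; only the model name and the Google Search grounding tool come from the description above.

```typescript
import { GoogleGenAI } from "@google/genai";

// Hypothetical helper: one grounded call covering the five research angles.
async function researchProduct(productName: string): Promise<string> {
  const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
  const response = await ai.models.generateContent({
    model: "gemini-2.5-flash",
    contents:
      `Research "${productName}" across five angles: expert reviews, recurring problems, ` +
      "specifications, current pricing, and release context. Cite a source URL for every claim.",
    config: { tools: [{ googleSearch: {} }] }, // Google Search Grounding
  });
  return response.text ?? "";
}
```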
Inline citations [1] [2] [3] map every factual claim back to a specific source URL. A claim without a citation does not appear on the page. The Score Breakdown reasoning and the Source Consensus rows are constrained to be grounded in the same evidence — they cannot introduce new claims the analysis didn't already establish.
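A minimal sketch of how that no-citation, no-claim rule could be enforced, assuming a claim shape like the one below (the names are hypothetical):

```typescript
// Hypothetical claim shape: citations are indices like [1], [2] into the
// page's source URL list. Uncited or dangling claims are dropped.
interface Claim {
  text: string;
  citations: number[];
}

function keepCitedClaims(claims: Claim[], sourceUrls: string[]): Claim[] {
  return claims.filter(
    (c) => c.citations.length > 0 && c.citations.every((i) => i >= 1 && i <= sourceUrls.length)
  );
}
```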
Comparisons follow the same pipeline twice — once per product — and add a head-to-head verdict step that picks a winner per dimension. The comparison page renders its own Score Breakdown (winner per dimension) and Consensus widget (which side reviewers favour on each theme) on top of the per-product data.
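The per-dimension verdict step can be pictured roughly as below, reusing the ScoreBreakdown shape sketched earlier; the function name and the tie handling are assumptions.

```typescript
// Hypothetical head-to-head verdict: run the pipeline once per product,
// then pick a winner per dimension from the two breakdowns.
type Dimension = "performance" | "value" | "reliability" | "hypeVsReality";

function pickWinners(
  a: ScoreBreakdown,
  b: ScoreBreakdown
): Record<Dimension, "A" | "B" | "tie"> {
  const dims: Dimension[] = ["performance", "value", "reliability", "hypeVsReality"];
  const result = {} as Record<Dimension, "A" | "B" | "tie">;
  for (const d of dims) {
    result[d] = a[d] > b[d] ? "A" : a[d] < b[d] ? "B" : "tie";
  }
  return result;
}
```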
Community-agreement chip
Each shared page that has received votes shows a small chip under the title — for example, 78% community agreement · 412 votes. This is a tally of AInalyzer visitors who voted Agree or Disagree with the verdict on that page. Votes are limited to one per device, and you can change your vote at any time. It is independent of the AI score and provides a lightweight reality check on whether the analysis matches what actual buyers think.
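The arithmetic behind the chip is simple; the sketch below assumes a vote record keyed by device, with a later vote replacing an earlier one (the shape and names are hypothetical).

```typescript
// Hypothetical tally: one vote per device, a changed vote replaces the old one.
interface Vote {
  deviceId: string;
  agrees: boolean;
}

function communityChip(votes: Vote[]): { agreementPct: number; total: number } {
  const latestByDevice = new Map<string, boolean>();
  for (const v of votes) latestByDevice.set(v.deviceId, v.agrees); // last vote wins
  const values = [...latestByDevice.values()];
  const agrees = values.filter(Boolean).length;
  const total = values.length;
  return { agreementPct: total === 0 ? 0 : Math.round((agrees / total) * 100), total };
}
```

For example, 321 Agree votes out of 412 would render as 78% community agreement · 412 votes.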
A high agreement percentage on a low-scoring product is a stronger signal than the score on its own — it means real users see the same problems the AI flagged. A low agreement percentage on a high-scoring product is a flag worth taking seriously.
What this framework can and can't do
The AInalyzer score is a synthesis of public reviews. It is not a substitute for hands-on testing. It works well for products with broad coverage — phones, laptops, headphones, TVs, kitchen appliances — where dozens of independent reviews converge into stable signal. It is less reliable for very new launches with only a handful of sources, for niche professional gear with thin coverage, or for regional products where most coverage isn't in English.
For high-stakes purchases we recommend treating the AInalyzer page as the starting point of your research — a structured briefing with the receipts attached — and then opening the cited sources to dig deeper on the dimensions that matter most to you.