Semantic search
Overview
Semantic search ranks services by comparing the meaning of your query with an indexed summary of each service's name, description, features, and benefits. Because it compares meaning, it does not rely only on exact word matches. Ranking also uses keyword overlap on that same text, so exact terms can still boost a result.
A service can rank well even when it uses different wording from your query.
What text is used for semantic relevance
Each service is converted into one searchable text field called embed_text. That field is embedded with the all-MiniLM-L6-v2 model and stored in PostgreSQL as a vector.
The semantic relevance score uses these fields:
| Field | How it is used |
|---|---|
| service_name | The service name, as plain text. |
| description | The supplier-written service description. |
| features | The listed service features, joined into one Features: section. |
| benefits | The listed service benefits, joined into one Benefits: section. |
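As an illustration of how the four fields could be combined, here is a minimal sketch. The function name `build_embed_text`, the newline separator, and the "; " joiner are assumptions made for this example; only the four fields and the Features: and Benefits: section labels come from the description above.

```python
def build_embed_text(service_name, description, features, benefits):
    """Join the four listing fields into one text block (hypothetical helper)."""
    parts = [service_name.strip(), description.strip()]
    if features:
        # All features go into a single "Features:" section.
        parts.append("Features: " + "; ".join(f.strip() for f in features))
    if benefits:
        # All benefits go into a single "Benefits:" section.
        parts.append("Benefits: " + "; ".join(b.strip() for b in benefits))
    # Empty fields are skipped so the result has no blank lines.
    return "\n".join(p for p in parts if p)


example = build_embed_text(
    "Payroll Cloud",
    "Hosted payroll for public sector bodies.",
    ["Payslip generation", "Pension reporting"],
    ["Fewer manual errors"],
)
print(example)
```

Supplier name, price, and the other excluded fields never enter this text, so they cannot influence the semantic score.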
How semantic relevance is defined
When you enter a query, the query is also converted into an embedding. The system compares that query embedding with each service embedding using cosine similarity through pgvector.
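Cosine similarity itself is simple to state. The sketch below computes it in plain Python for two short, made-up vectors; it is not the page's actual query. In pgvector, the `<=>` operator returns cosine distance, which is 1 minus this value, so a database-side implementation would typically convert between the two.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors; real embeddings are much longer.
query_vec = [1.0, 0.0, 1.0]
service_vec = [1.0, 0.0, 0.0]
print(cosine_similarity(query_vec, service_vec))
```

Vectors pointing in the same direction score 1.0, and unrelated (orthogonal) vectors score 0.0.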
The displayed ranking uses a configured hybrid score for this search page:
- 65% semantic score: meaning similarity between your query and the service's embed_text.
- 35% keyword score: PostgreSQL full-text keyword score against the same embed_text.
This means semantic similarity is the main signal, while exact keyword overlap can still improve a result's rank.
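The weighting can be sketched as a single weighted sum. This example assumes both input scores are already scaled to the range 0 to 1; how the keyword score is normalised is not described here, so treat that as an assumption. The function and constant names are hypothetical.

```python
SEMANTIC_WEIGHT = 0.65  # share given to meaning similarity
KEYWORD_WEIGHT = 0.35   # share given to full-text keyword score


def hybrid_score(semantic_score, keyword_score):
    """Weighted sum of the two scores (inputs assumed to be in 0..1)."""
    return SEMANTIC_WEIGHT * semantic_score + KEYWORD_WEIGHT * keyword_score


# Strong semantic match with little keyword overlap.
print(hybrid_score(0.8, 0.2))
# Weaker semantic match with strong keyword overlap.
print(hybrid_score(0.5, 0.9))
```

In the first call the semantic part contributes 0.52 and the keyword part 0.07, which shows how a service with little exact wording in common can still score well.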
The search results page converts the hybrid score into a simple label:
- High: final score of 0.65 or higher.
- Medium: final score of 0.45 or higher, but below 0.65.
- Low: final score below 0.45.
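The thresholds above map onto a short function. The name `relevance_label` is hypothetical; the cut-off values are the ones listed.

```python
def relevance_label(final_score):
    """Convert a final hybrid score into the label shown on the results page."""
    if final_score >= 0.65:
        return "High"
    if final_score >= 0.45:
        return "Medium"
    return "Low"


for score in (0.70, 0.65, 0.50, 0.44):
    print(score, relevance_label(score))
```

A score of exactly 0.65 is labelled High, and a score of exactly 0.45 is labelled Medium.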
The 65/35 split is not an industry standard. It is a transparent ranking choice for this search page. Large marketplaces usually tune ranking with search logs, relevance testing, structured metadata, and quality signals.
What is not used for semantic relevance
The following fields may be shown in search results or used as filters, but they are not part of the semantic embedding and do not affect semantic ranking:
- Supplier name.
- Lot.
- Marketplace categories.
- Certifications.
- Connected public sector networks.
- Staff clearance.
Other excluded fields are grouped below:
- Commercial and identity details: Price, framework, service ID, and contact details.
- Documents and extended service content: Service documents, PDF titles, and detailed service-section narratives such as planning, training, social value, and user-support paragraphs.
- Technical configuration flags: Deployment model, data location, authentication options, support options, and similar structured filter fields.
Filters
Overview
Filters narrow the result set using structured fields from the G-Cloud 14 listing. Each filter is a pass-or-fail check: a service either matches or it does not.
Filters do not rank results. They do not change semantic relevance scores. They only decide which services are eligible to appear.
How filter logic works
The filter panel is organised into groups. Each group represents one type of listing attribute, such as lot, marketplace categories, or security certification.
Within the Categories group, selections use OR logic. If you tick more than one category, a service is included when it is listed in any of the selected categories.
Across different filter groups, selections use AND logic. If you set a lot, pick categories, and tick options in other groups, a service must satisfy every active group before it appears.
For example, if you select:
- Lot: Cloud software
- Categories: Accounting and finance; Human resources and employee management
- Security certification: ISO/IEC 27001
the results include only Cloud software services that are in Accounting and finance or Human resources and employee management, and that also have ISO/IEC 27001.
Lot and minimum government security clearance are single-choice controls. Each checked option in the technical filter groups adds one further requirement to the overall AND condition.
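The OR-within-categories and AND-across-groups behaviour can be sketched as follows. The dictionary layout, the group key `security_certification`, and the function name `matches_filters` are assumptions made for illustration, not the page's real data model.

```python
def matches_filters(service, lot=None, categories=(), required_options=None):
    """Return True when a service passes every active filter group."""
    # Lot is single-choice: the service must be in the selected lot.
    if lot is not None and service["lot"] != lot:
        return False
    # Categories use OR: any one selected category is enough.
    if categories and not (set(categories) & service["categories"]):
        return False
    # Each checked option in the other groups is a further AND requirement.
    for group, wanted in (required_options or {}).items():
        if not set(wanted) <= service["options"].get(group, set()):
            return False
    return True


services = [
    {"id": "a", "lot": "Cloud software",
     "categories": {"Accounting and finance"},
     "options": {"security_certification": {"ISO/IEC 27001"}}},
    {"id": "b", "lot": "Cloud software",
     "categories": {"Human resources and employee management"},
     "options": {}},
    {"id": "c", "lot": "Cloud hosting",
     "categories": {"Accounting and finance"},
     "options": {"security_certification": {"ISO/IEC 27001"}}},
]

selected = [
    s["id"] for s in services
    if matches_filters(
        s,
        lot="Cloud software",
        categories=["Accounting and finance",
                    "Human resources and employee management"],
        required_options={"security_certification": ["ISO/IEC 27001"]},
    )
]
print(selected)
```

Only service "a" passes: "b" lacks the certification and "c" is in a different lot.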
What the filter set covers
The left-hand filter panel includes these groups:
- Lot and categories: Lot, and official marketplace categories.
- Security and assurance: Security certification, security governance standards, minimum government security clearance, and staff security clearance.
- Cloud and data: Cloud deployment model, multi-cloud support, data storage and processing locations, and datacentre security standard.
- Networks and data protection: Connected public sector networks, data protection between buyer and supplier networks, and data protection within the supplier network.
- Access and support: Management access authentication, user authentication, user support, service interface accessibility, and user support accessibility.
- Service delivery: Using the service, metrics reporting, pricing options, supplier type, and Cloud Support scope for Lot 3.
How filters work with search
Filters and semantic search are separate. Filters only include or exclude services. Ranking is handled by semantic search, and only when you enter a query.
With a search query, filters narrow the eligible set first. Semantic search then ranks only those services by relevance, as described in the semantic search section above.
With no query, there is no relevance ranking. Filtered services are shown in a fixed pseudo-random order for your session instead. The order is the same while you paginate or change filters, but it is not sorted by relevance, price, supplier name, or any other listing attribute.
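One way to get an order that is fixed for a session and stays consistent as filters change is to sort by a hash of a per-session seed and the service ID. This is only a sketch of that idea under assumed names (`session_seed`, `order_results`); the page's actual ordering method is not described here.

```python
import hashlib


def session_order_key(session_seed, service_id):
    """Stable pseudo-random sort key for one service within one session."""
    raw = f"{session_seed}:{service_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def order_results(eligible_ids, session_seed, scores=None):
    """Order services that have already passed the filters."""
    if scores is not None:
        # A query was entered: rank by relevance score, highest first.
        return sorted(eligible_ids, key=lambda sid: scores[sid], reverse=True)
    # No query: fixed pseudo-random order for this session.
    return sorted(eligible_ids, key=lambda sid: session_order_key(session_seed, sid))


ids = ["s1", "s2", "s3", "s4"]
full = order_results(ids, "session-abc")
narrowed = order_results(["s1", "s3"], "session-abc")
ranked = order_results(ids, "session-abc",
                       scores={"s1": 0.2, "s2": 0.9, "s3": 0.5, "s4": 0.1})
print(full, narrowed, ranked)
```

Because each service's key depends only on the seed and its own ID, narrowing the filters removes services without reshuffling the ones that remain.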