data.gv.at MCP

Finding Datasets

Locate datasets matching your search criteria using text queries and filters

Discover datasets by searching with natural language, keywords, or structured filters.

When to use this guide

Use this guide when you need to:

  • Locate datasets about a specific topic
  • Filter by theme, format, or publisher
  • Refine broad search results

Prerequisites

  • data.gv.at MCP Server connected in Claude Desktop
  • Understanding of your search requirements (topic, format preferences)

Ask Claude in natural language:

Find datasets about Vienna population

Claude uses semantic search automatically and shows relevant datasets.

Good for:

  • Exploratory searches
  • Natural language queries
  • Quick discovery

Ask Claude with requirements

Tell Claude your filters in natural language:

Find CSV datasets about health from Vienna

Claude chooses between semantic_search_datasets and search_datasets automatically.

Try it:

  • "Show me environment datasets"
  • "Find recent population data in JSON format"
  • "Search for social datasets from the Austrian government"

Direct API call with faceted filters

search_datasets(
    query="bevölkerung",
    themes=["SOCI"],
    formats=["CSV", "JSON"],
    publishers=["stadt-wien"],
    min_date="2024-01-01",
    boost_quality=True,
    sort_by="modified_desc",
    limit=50
)

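Parameters (types as used in the example call above):

  • query (string): search text
  • themes (list of strings): EU DCAT-AP theme codes, such as "SOCI"
  • formats (list of strings): file formats, such as "CSV" or "JSON"
  • publishers (list of strings): publisher identifiers, such as "stadt-wien"
  • min_date (string): date lower bound in YYYY-MM-DD form
  • boost_quality (boolean): apply the quality boost to the results
  • sort_by (string): sort order, such as "modified_desc"
  • limit (integer): maximum number of results to return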

Returns:

{
  "results": [
    {
      "id": "bev-stat-wien-2024",
      "title": "Bevölkerung Wien 2020-2024",
      "description": "Quarterly population statistics by district...",
      "quality_score": 87
    }
  ],
  "count": 42,
  "facets": {
    "themes": {"SOCI": 30, "HEAL": 12},
    "formats": {"CSV": 25, "JSON": 17}
  }
}
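A minimal sketch of reading this response shape in Python, assuming the tool result has already been parsed into a dict. The helper name summarize_response is hypothetical and not part of the server; the field names are taken from the example response above.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> list[str]:
    """Return human-readable lines describing a search_datasets response.

    Assumes the shape shown above: "results", "count", and "facets".
    """
    lines = [f"{response.get('count', 0)} datasets matched"]

    # One line per returned dataset; quality_score may be absent.
    for item in response.get("results", []):
        score = item.get("quality_score", "n/a")
        lines.append(f"{item['id']}: {item['title']} (quality {score})")

    # Facet values sorted by descending count, then by name for stable output.
    for facet, counts in response.get("facets", {}).items():
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        summary = ", ".join(f"{name}={n}" for name, n in ranked)
        lines.append(f"{facet}: {summary}")

    return lines


example = {
    "results": [
        {
            "id": "bev-stat-wien-2024",
            "title": "Bevölkerung Wien 2020-2024",
            "description": "Quarterly population statistics by district...",
            "quality_score": 87,
        }
    ],
    "count": 42,
    "facets": {
        "themes": {"SOCI": 30, "HEAL": 12},
        "formats": {"CSV": 25, "JSON": 17},
    },
}

for line in summarize_response(example):
    print(line)
```

Note that count (42) is the total number of matches, while results holds only the datasets returned in this response.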

Error handling:

try:
    results = search_datasets(query="bevölkerung")
except ToolError as e:
    # ToolError: the tool-call exception type of your MCP client library.
    # A typical cause is that the API is temporarily unavailable.
    print(f"Search failed: {e}")
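Because the failure may be temporary, one option is to retry a few times before giving up. The sketch below assumes nothing about the real client: ToolError is defined locally as a stand-in for whatever exception your MCP client raises, and search_with_retry and flaky_search are hypothetical names used only for illustration.

```python
import time
from typing import Any, Callable


class ToolError(Exception):
    """Stand-in for the exception raised by the MCP client (assumption)."""


def search_with_retry(
    search: Callable[..., dict[str, Any]],
    attempts: int = 3,
    base_delay: float = 0.5,
    **kwargs: Any,
) -> dict[str, Any]:
    """Call search(**kwargs), retrying on ToolError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return search(**kwargs)
        except ToolError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Fake search that fails twice and then succeeds, to exercise the retry path.
calls = {"n": 0}


def flaky_search(**kwargs: Any) -> dict[str, Any]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ToolError("API temporarily unavailable")
    return {"results": [], "count": 0, "query": kwargs.get("query")}


result = search_with_retry(flaky_search, base_delay=0.01, query="bevölkerung")
print(result["query"], "after", calls["n"], "calls")
```

Passing the search function in as an argument keeps the retry logic independent of how the tool is actually invoked.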

Semantic search for natural queries

Semantic search understands the meaning behind your query, not just keywords. It finds conceptually related datasets even when they don't contain exact search terms.

Visual example: Semantic search in action

The screenshot for this section is still a placeholder; real Claude Desktop screenshots will be added later. It illustrates the following flow:

  • Tool call: semantic_search_datasets with natural_query set to "datasets about public health in Austrian cities"
  • Language detection: the query is identified as English
  • Theme expansion: the query is expanded to the related EU DCAT-AP themes HEAL (health data), SOCI (social statistics), and REGI (regional and city data)
  • Results: five datasets, each shown with title, description, theme tags, and quality score: Vienna Health Services (85), Urban Health Indicators Austria (82), Public Health Monitoring Salzburg (79), Healthcare Access Vienna (76), and Disease Prevention Programs (74)

The expansion surfaces datasets relevant to the query concept even when their titles contain no exact keyword match.

Key features of semantic search shown:

  • Language detection: Automatically identifies query language (English/German)
  • Theme expansion: Maps natural query to EU themes (HEAL, SOCI, REGI)
  • Conceptual matching: Finds datasets about the topic even without exact keyword matches
  • Quality indicators: Shows metadata completeness scores for each result

Natural language queries

Use complete questions or descriptions:

Find datasets about Vienna's air quality monitoring stations

Claude automatically expands the query with synonyms and related themes.

When to use semantic search:

  • You have a question in natural language
  • You want related datasets (not exact keyword matches)
  • You need multi-language support (German/English)

Direct semantic search call

semantic_search_datasets(
    natural_query="Luftqualität Wien Messstationen",
    formats=["CSV"],
    boost_quality=True
)

What happens:

  1. Language detection (German vs English)
  2. Query expansion via LLM (adds synonyms, related themes)
  3. Standard search with expanded terms
  4. Quality boost applied if enabled

Response includes expansion info:

{
  "results": [...],
  "count": 15,
  "expansion_info": {
    "detected_language": "de",
    "semantic_themes": ["ENVI"],
    "confidence": 0.85
  }
}

Error handling:

If semantic expansion fails, the search falls back to the original query:

{
  "results": [...],
  "expansion_info": {
    "fallback": true,
    "reason": "Low confidence expansion"
  }
}
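To tell the two response shapes apart in code, a small helper can inspect expansion_info. This is a sketch under the assumption that the response has been parsed into a dict with the fields shown above; describe_expansion is a hypothetical name, and the elided results lists are replaced by empty lists here.

```python
from typing import Any


def describe_expansion(response: dict[str, Any]) -> str:
    """Summarize the expansion_info block of a semantic search response."""
    info = response.get("expansion_info", {})

    # A fallback response carries "fallback": true and a human-readable reason.
    if info.get("fallback"):
        return f"fallback to original query: {info.get('reason', 'unknown reason')}"

    themes = ", ".join(info.get("semantic_themes", [])) or "none"
    language = info.get("detected_language", "unknown")
    confidence = info.get("confidence")
    if confidence is None:
        return f"expanded ({language}), themes: {themes}"
    return f"expanded ({language}), themes: {themes}, confidence {confidence:.2f}"


expanded = {
    "results": [],  # omitted for brevity
    "count": 15,
    "expansion_info": {
        "detected_language": "de",
        "semantic_themes": ["ENVI"],
        "confidence": 0.85,
    },
}
fallback = {
    "results": [],  # omitted for brevity
    "expansion_info": {"fallback": True, "reason": "Low confidence expansion"},
}

print(describe_expansion(expanded))
print(describe_expansion(fallback))
```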

Theme-based filtering

Filter by topic area

Tell Claude which themes you need:

Find datasets about health and social topics

Claude maps topics to EU DCAT-AP theme codes automatically.

Direct theme filtering

Use EU DCAT-AP theme codes:

search_datasets(
    themes=["HEAL", "SOCI"],
    limit=50
)

Available themes:

  • AGRI: Agriculture, fisheries, forestry
  • ECON: Economy and finance
  • EDUC: Education, culture, sport
  • ENER: Energy
  • ENVI: Environment
  • GOVE: Government and public sector
  • HEAL: Health
  • INTR: International issues
  • JUST: Justice, legal system, public safety
  • REGI: Regions and cities
  • SOCI: Population and society
  • TECH: Science and technology
  • TRAN: Transport

Filtering logic:

  • Multiple themes: OR within same facet (HEAL OR SOCI)
  • Combine with other filters: AND between facets (themes AND formats)
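The OR-within, AND-between rule can be illustrated locally. The sketch below is not the server's implementation: Dataset and matches are hypothetical names, the catalog is made up, and treating an omitted facet as "no constraint" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical in-memory record; real datasets carry many more fields."""
    id: str
    themes: set = field(default_factory=set)
    formats: set = field(default_factory=set)


def matches(dataset: Dataset, themes=None, formats=None) -> bool:
    # OR within a facet: any one of the requested values may match.
    # An omitted (None or empty) facet places no constraint (assumption).
    theme_ok = not themes or bool(dataset.themes & set(themes))
    format_ok = not formats or bool(dataset.formats & set(formats))
    # AND between facets: every supplied facet must match.
    return theme_ok and format_ok


catalog = [
    Dataset("a", {"HEAL"}, {"CSV"}),
    Dataset("b", {"SOCI"}, {"JSON"}),
    Dataset("c", {"ENVI"}, {"CSV"}),
    Dataset("d", {"HEAL", "SOCI"}, {"PDF"}),
]

hits = [
    d.id
    for d in catalog
    if matches(d, themes=["HEAL", "SOCI"], formats=["CSV", "JSON"])
]
print(hits)  # ['a', 'b']
```

Dataset "d" matches the theme facet twice over but is still excluded, because its only format fails the formats facet.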

Troubleshooting

Search returns no results

Symptom: Count is 0, no datasets found

Causes:

  • Query too specific
  • Filters too restrictive
  • Spelling errors in query

Solutions:

  1. Try broader query terms
  2. Remove format/publisher filters
  3. Check theme codes (must be uppercase: SOCI not soci)
  4. Use semantic_search_datasets for query expansion
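The first three solutions can be sketched as a loop that relaxes filters step by step. The helper search_relaxing and the stand-in fake_search are hypothetical; the search callable is assumed to accept the same keyword arguments as search_datasets and to return a dict with a count field.

```python
from typing import Any, Callable


def search_relaxing(
    search: Callable[..., dict[str, Any]],
    query: str,
    themes=None,
    formats=None,
    publishers=None,
):
    """Retry a search with progressively fewer filters until something matches.

    Returns (response, filters_used).
    """
    # Theme codes must be uppercase (SOCI, not soci).
    themes = [t.upper() for t in themes] if themes else None

    attempts = [
        {"themes": themes, "formats": formats, "publishers": publishers},
        {"themes": themes},  # drop format and publisher filters
        {},                  # query only
    ]
    response: dict[str, Any] = {"results": [], "count": 0}
    used: dict[str, Any] = {}
    tried = []
    for filters in attempts:
        candidate = {k: v for k, v in filters.items() if v}
        if candidate in tried:
            continue  # identical to an earlier attempt, skip the extra call
        tried.append(candidate)
        used = candidate
        response = search(query=query, **used)
        if response.get("count", 0) > 0:
            break
    return response, used


def fake_search(**kwargs: Any) -> dict[str, Any]:
    # Stand-in: pretend that no dataset satisfies any format filter.
    count = 0 if "formats" in kwargs else 3
    return {"results": [], "count": count}


response, used = search_relaxing(
    fake_search, "bevölkerung", themes=["soci"], formats=["XLSX"]
)
print(response["count"], used)  # 3 {'themes': ['SOCI']}
```

Returning the filters that finally produced results makes it clear to the caller which constraint had to be dropped.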

Semantic search returns irrelevant results

Symptom: Results don't match query intent

Cause: Language detection incorrect or semantic expansion too broad

Solutions:

  1. Check expansion_info.detected_language in response
  2. If the detected language is wrong, call search_datasets() directly instead
  3. Add explicit theme filters to constrain expansion
  4. Use more specific query terms

Quality boost returns no results

Symptom: Search with boost_quality=True returns empty

Cause: Quality filter too strict for domain

Solutions:

  1. Try search without boost_quality first
  2. Review quality scores of all results
  3. Expect lower quality scores in niche domains
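Reviewing the scores of an unboosted search is easier with a quick summary. The sketch below assumes results carry the quality_score field shown in the search_datasets response; quality_summary is a hypothetical helper, and the scores are made up for illustration.

```python
import statistics
from typing import Any


def quality_summary(response: dict[str, Any]) -> dict[str, float]:
    """Summarize quality_score across the results that report one."""
    scores = [
        r["quality_score"]
        for r in response.get("results", [])
        if "quality_score" in r
    ]
    if not scores:
        return {}
    return {
        "min": min(scores),
        "median": statistics.median(scores),
        "max": max(scores),
    }


# Made-up scores for illustration only.
unboosted = {
    "results": [
        {"id": "a", "quality_score": 41},
        {"id": "b", "quality_score": 55},
        {"id": "c", "quality_score": 62},
        {"id": "d"},  # no score reported
    ],
    "count": 4,
}
print(quality_summary(unboosted))  # {'min': 41, 'median': 55, 'max': 62}
```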

Facets show unexpected counts

Symptom: Facet counts don't match expected distribution

Cause: Multiple filters applied, counts reflect intersections

Solutions:

  1. Remove filters to see full facet distribution
  2. Check if datasets have multiple themes (counted in each)
  3. Review filtered vs unfiltered search results
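The multi-theme effect is easy to reproduce locally. In the sketch below the result records and their themes field are hypothetical (the example response earlier does not show per-dataset themes); it only demonstrates why facet counts can add up to more than the number of datasets.

```python
from collections import Counter

# Hypothetical result set: each dataset may carry several themes.
results = [
    {"id": "a", "themes": ["HEAL"]},
    {"id": "b", "themes": ["HEAL", "SOCI"]},
    {"id": "c", "themes": ["SOCI", "REGI"]},
]

# A dataset is counted once under every theme it carries.
theme_facet = Counter(theme for r in results for theme in r["themes"])

print(len(results))               # 3
print(dict(theme_facet))          # {'HEAL': 2, 'SOCI': 2, 'REGI': 1}
print(sum(theme_facet.values()))  # 5
```

Three datasets produce facet counts that sum to five, so a facet total above count does not by itself indicate a problem.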
