Designing the Read API for Get Information About Schools (GIAS)

What we were trying to solve

Users of Get Information About Schools (GIAS) often need structured school and group data for research, analysis, or operational systems. The existing GIAS interface is designed for people, not machines, so organisations frequently scrape pages or handle large CSV downloads to get the data they need.

We needed a cleaner, more predictable way for systems to access GIAS data directly. This led us to design a read-only API that exposes Establishments (schools) and Establishment Groups in a consistent way.

What we built

We designed a prototype API that:

lets clients request school or group data through simple HTTP endpoints
returns only the fields the client actually needs
streams large datasets efficiently
has predictable behaviour and clear error responses
is easy to extend as new domains (like workforce or governance) come on stream

Although the API is technical behind the scenes, the main idea is very simple: let systems fetch GIAS data without scraping or downloading everything.

How the API works in plain language

1. A client sends a request

Examples:

“Give me the establishment with URN 123456”
“Give me all academies, but only return name, postcode, and status”
“Give me details for this trust”.

The API checks that required parameters are present - for example, calling a “list” endpoint without choosing fields returns a helpful error message instead of data.
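The parameter check described above could look something like the following. This is a minimal sketch and not the prototype's code: the `fields` parameter name, the `ALLOWED_FIELDS` set and the `BadRequest` exception are all assumptions made for illustration.

```python
# Hypothetical set of selectable fields; the real API's field names may differ.
ALLOWED_FIELDS = {"urn", "name", "postcode", "status"}


class BadRequest(Exception):
    """Raised when a required parameter is missing or unusable (hypothetical)."""


def parse_fields(query: dict[str, str]) -> list[str]:
    """Return the field names a client asked for, or raise a helpful error."""
    raw = query.get("fields", "")
    fields = [part.strip() for part in raw.split(",") if part.strip()]
    if not fields:
        raise BadRequest(
            "The 'fields' parameter is required, for example fields=name,postcode"
        )
    unknown = sorted(set(fields) - ALLOWED_FIELDS)
    if unknown:
        raise BadRequest("Unknown field(s): " + ", ".join(unknown))
    return fields


# "Give me all academies, but only return name, postcode, and status"
selected = parse_fields({"type": "academy", "fields": "name,postcode,status"})
```

Rejecting the request before any data is fetched keeps the error cheap and tells the client exactly what to change.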

2. The system gathers the data

The work is handled in layers:

Use cases decide what should happen when a request is made
Repositories decide how to get the data (currently SQL, but this could change)
Mappers turn raw database rows into properly structured domain objects
Domain models enforce rules (e.g. “a school must have a valid name”).

Each layer has one job, which makes the system easier to maintain.
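To make the layering more concrete, here is a small sketch of how the four pieces might fit together. It is an illustration under assumptions, not the prototype's implementation: the class names, the `establishment` table and the use of SQLite are all invented for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Establishment:
    """Domain model: enforces a rule such as "a school must have a valid name"."""
    urn: int
    name: str

    def __post_init__(self) -> None:
        if not self.name.strip():
            raise ValueError("a school must have a valid name")


def map_row(row: tuple) -> Establishment:
    """Mapper: turns a raw database row into a domain object."""
    urn, name = row
    return Establishment(urn=int(urn), name=str(name))


class EstablishmentRepository(Protocol):
    """Repository interface: the use case depends on this, not on SQL."""
    def find_by_urn(self, urn: int) -> Optional[Establishment]: ...


class SqlEstablishmentRepository:
    """One possible repository, backed by SQL (SQLite purely for illustration)."""
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find_by_urn(self, urn: int) -> Optional[Establishment]:
        row = self._conn.execute(
            "SELECT urn, name FROM establishment WHERE urn = ?", (urn,)
        ).fetchone()
        return map_row(row) if row else None


class GetEstablishment:
    """Use case: decides what happens when a single establishment is requested."""
    def __init__(self, repo: EstablishmentRepository) -> None:
        self._repo = repo

    def execute(self, urn: int) -> Optional[Establishment]:
        return self._repo.find_by_urn(urn)


# Wire the layers together against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE establishment (urn INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO establishment VALUES (123456, 'Example Primary School')")
use_case = GetEstablishment(SqlEstablishmentRepository(conn))
school = use_case.execute(123456)
```

Because the use case only knows about the repository interface, swapping SQL for another data source would mean writing a new repository class and leaving the rest untouched.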

3. The domain checks the data is valid

Before anything gets returned, the core domain model verifies the data:

Are the fields complete?
Do they follow business rules?
Are values like postcodes or contact details valid?

If anything breaks a rule, the request fails safely.
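As a rough illustration of a domain model that checks its own data, the sketch below validates a name and a postcode when the object is created. The class and exception names are hypothetical, and the postcode pattern is deliberately simplified; it is not the rule set GIAS actually applies.

```python
import re
from dataclasses import dataclass

# Simplified UK postcode shape for illustration only; real validation is stricter.
POSTCODE_PATTERN = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")


class DomainValidationError(ValueError):
    """Raised when data breaks a business rule (hypothetical name)."""


@dataclass(frozen=True)
class Establishment:
    name: str
    postcode: str

    def __post_init__(self) -> None:
        if not self.name.strip():
            raise DomainValidationError("a school must have a valid name")
        normalised = self.postcode.strip().upper()
        if not POSTCODE_PATTERN.match(normalised):
            raise DomainValidationError(f"invalid postcode: {self.postcode!r}")
        # Frozen dataclasses need object.__setattr__ to store the cleaned value.
        object.__setattr__(self, "postcode", normalised)


def build_or_fail(name: str, postcode: str) -> tuple[bool, str]:
    """Fail safely: report the broken rule instead of returning bad data."""
    try:
        school = Establishment(name=name, postcode=postcode)
    except DomainValidationError as exc:
        return False, str(exc)
    return True, school.postcode
```

Putting the checks in the constructor means an invalid establishment can never exist in memory, so later layers do not need to re-check it.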

4. The API returns the result

Clients get data in one of two formats:

JSON, streamed for efficiency
CSV, if requested

Error handling is consistent:

missing items → 404
unexpected issues → 500
everything else → 200 OK
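The status code mapping above can be sketched as a single wrapper around each request handler. This is an assumption about how it might be structured, not the prototype's code; `NotFound` and `respond` are hypothetical names.

```python
from typing import Any, Callable


class NotFound(Exception):
    """Raised when the requested item does not exist (hypothetical name)."""


def respond(action: Callable[[], Any]) -> tuple[int, Any]:
    """Run a handler and translate its outcome into (status code, body)."""
    try:
        return 200, action()
    except NotFound as exc:
        # Missing items become a 404 with the reason in the body.
        return 404, {"error": str(exc)}
    except Exception:
        # Anything unexpected becomes a 500 without leaking internal details.
        return 500, {"error": "An unexpected error occurred"}


def find_missing_school() -> dict:
    raise NotFound("No establishment found with URN 999999")


status, body = respond(find_missing_school)
```

Keeping the mapping in one place is what makes the behaviour consistent: individual handlers raise exceptions and never choose status codes themselves.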

The API only returns what was requested, so if a client only wants the “name” and “postcode”, it won’t receive extra data.
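Field selection and streaming can be combined so that each record is trimmed and written out one at a time. The sketch below shows one way to do that with generators; the function names and the sample record are invented for illustration and do not reflect the prototype's actual output format.

```python
import csv
import io
import json
from typing import Iterable, Iterator


def project(record: dict, fields: list[str]) -> dict:
    """Keep only the fields the client asked for."""
    return {field: record[field] for field in fields if field in record}


def stream_json(records: Iterable[dict], fields: list[str]) -> Iterator[str]:
    """Yield a JSON array piece by piece, never holding the full list in memory."""
    yield "["
    first = True
    for record in records:
        if not first:
            yield ","
        yield json.dumps(project(record, fields))
        first = False
    yield "]"


def stream_csv(records: Iterable[dict], fields: list[str]) -> Iterator[str]:
    """Yield CSV text one line at a time, starting with the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")

    def drain() -> str:
        # Hand back what has been written so far and empty the buffer.
        chunk = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return chunk

    writer.writeheader()
    yield drain()
    for record in records:
        writer.writerow(record)
        yield drain()


rows = [{"urn": 123456, "name": "Example School", "postcode": "AB1 2CD", "status": "Open"}]
json_body = "".join(stream_json(iter(rows), ["name", "postcode"]))
csv_body = "".join(stream_csv(iter(rows), ["name", "postcode"]))
```

In a real service the chunks would be written to the HTTP response as they are produced; joining them into one string here is only to keep the example short.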

Why we took a clean architecture approach

We chose a layered “clean architecture” style so the core business rules stay stable even if:

the database changes
the API format changes
new domains are added
we change how we shape or map data.

This helps ensure:

predictable behaviour
easy testing (business logic can run without a database)
easier long-term maintenance
safer changes as the system grows.

Future extensions (like governance, workforce or financial data) can follow the same pattern.
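The testing benefit mentioned above is easiest to see with a fake repository. In this sketch the business logic runs against a plain dictionary, so no database is needed; every name here is hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Establishment:
    urn: int
    name: str


class NotFound(Exception):
    """Raised by the use case when nothing matches (hypothetical name)."""


class InMemoryEstablishmentRepository:
    """Test double that stands in for the SQL-backed repository."""
    def __init__(self, items: list[Establishment]) -> None:
        self._by_urn = {item.urn: item for item in items}

    def find_by_urn(self, urn: int) -> Optional[Establishment]:
        return self._by_urn.get(urn)


def get_establishment(repo, urn: int) -> Establishment:
    """Use case under test: it only needs something with find_by_urn."""
    found = repo.find_by_urn(urn)
    if found is None:
        raise NotFound(f"No establishment found with URN {urn}")
    return found


def test_get_establishment_without_a_database() -> None:
    repo = InMemoryEstablishmentRepository([Establishment(123456, "Example School")])
    assert get_establishment(repo, 123456).name == "Example School"
    try:
        get_establishment(repo, 999999)
    except NotFound:
        pass
    else:
        raise AssertionError("expected NotFound for an unknown URN")


test_get_establishment_without_a_database()
```

A new domain such as governance or workforce could reuse the same shape: its own model, its own repository interface, and an in-memory fake for tests.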

What we learned

Streaming is essential. Some datasets are too large to load into memory
Allowing clients to choose fields prevents over-fetching and keeps responses small
Separating domain rules from infrastructure helps reduce regressions
Consistent logging across use cases greatly improves traceability when debugging.
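One way to get consistent logging across use cases is to wrap each one in the same decorator, so every call produces the same start, finish and failure messages. The sketch below is an assumption about how that could be done, not a description of the prototype; the logger name, message format and per-call identifier are invented.

```python
import functools
import logging
import uuid

logger = logging.getLogger("gias.read_api")  # hypothetical logger name


def logged_use_case(func):
    """Log the start, end and any failure of a use case in one fixed format."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_id = uuid.uuid4().hex  # lets one request be traced across log lines
        logger.info("start use_case=%s call_id=%s", func.__name__, call_id)
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception("failed use_case=%s call_id=%s", func.__name__, call_id)
            raise
        logger.info("done use_case=%s call_id=%s", func.__name__, call_id)
        return result
    return wrapper


@logged_use_case
def get_establishment(urn: int) -> dict:
    # Placeholder body; a real use case would call a repository here.
    return {"urn": urn, "name": "Example School"}
```

Because the format is identical for every use case, a single search on the call identifier shows everything that happened during one request.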

What’s next

The prototype lays the groundwork for:

adding more domains of GIAS data
validating with internal and external data consumers
planning how the API could replace current CSV downloads or scraping
exploring authentication, rate limiting, and versioning approaches.

Written by Spencer O’Hegarty, Senior Software Engineer
