Dataset Assembly Tool – Social Wellbeing Agency

Social Wellbeing Agency

Read how the Social Wellbeing Agency (SWA) provides an open source tool and instructions that make data wrangling in the Integrated Data Infrastructure (IDI) faster and easier for researchers.


The IDI brings multiple datasets together to answer questions about complex issues that affect New Zealanders. However, because the format of every dataset is different, researchers must do significant work to prepare and assemble data for each project. This reduces efficiency and increases the costs of conducting research.


An open source assembly tool was created to automate the creation of research ready datasets. Researchers conduct a small amount of preparation and enter the details of what they want to be assembled into control files. The tool then reads these control files and produces an output in a standard format. This automation can halve the time taken to prepare data, saving weeks of staff time.
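The control-file idea can be illustrated with a minimal sketch. This is not the agency's implementation: the control file columns (`source`, `id_column`, `value_column`, `label`, `summary`), the sample tables, and the function names are all hypothetical, chosen only to show how a tool could read a researcher's instructions and emit every measure in one standard layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical control file: one row per measure the researcher wants assembled.
CONTROL_CSV = """source,id_column,value_column,label,summary
benefits,person_id,amount,total_benefit,sum
visits,pid,visit_id,visit_count,count
"""

# Hypothetical prepared inputs; note each source uses its own column names.
SOURCES = {
    "benefits": [
        {"person_id": "A", "amount": "100"},
        {"person_id": "A", "amount": "50"},
        {"person_id": "B", "amount": "70"},
    ],
    "visits": [
        {"pid": "A", "visit_id": "v1"},
        {"pid": "B", "visit_id": "v2"},
        {"pid": "B", "visit_id": "v3"},
    ],
}


def read_control(text):
    """Parse the control file text into a list of rule dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def assemble(control_rows, sources):
    """Apply each control rule and return rows in one standard layout."""
    output = []
    for rule in control_rows:
        totals = defaultdict(float)
        for record in sources[rule["source"]]:
            key = record[rule["id_column"]]
            if rule["summary"] == "sum":
                totals[key] += float(record[rule["value_column"]])
            elif rule["summary"] == "count":
                totals[key] += 1
            else:
                raise ValueError(f"Unknown summary: {rule['summary']}")
        # Every measure ends up with the same three output columns.
        for key in sorted(totals):
            output.append({"id": key, "label": rule["label"], "value": totals[key]})
    return output


rows = assemble(read_control(CONTROL_CSV), SOURCES)
for row in rows:
    print(row)
```

In this sketch, adding a new measure means adding one line to the control file rather than writing new assembly code, which is the kind of saving the paragraph above describes.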

Observations and lessons learnt

Collaboration with the public and non-government entities


How can others adopt the product itself, as well as the methodology, mindset, and approach behind it, when building other products? Detailed documentation (guidance, training docs, design principles) must accompany the product to support implementation. For example, communicating to users the design principles that guided what work was done, and how, helps to build a product design mindset.

Processes and adaptability


The product was built alongside the fixed deliverables of a given project. This shows that it is possible to deliver what a project needs while also changing a process to deliver future value by solving an overarching problem. The sentiment of "squeezing incremental improvements into projects" from the tool creator's manager enabled the approach.

I was able to think and create how to do something differently while delivering a project the old way.


The tool isn’t everything to everyone. Instead, it targets one part of the process and achieves that objective. This allows others to build and innovate on other parts of the data process.


Things always take more time than you think.

Attitude towards innovation


This project was owned and delivered by a passionate individual. It likely would not have happened without their leadership and management’s trust in them.

If I didn’t work here tomorrow, is there enough documentation to ensure its usefulness?


EARLY 2020

Tool created

Alongside another project, one SWA staff member created this tool and process on their own initiative.

LATE 2020

Statistics NZ

Initiated a discussion with Statistics NZ to ensure they are on board and understand the tool, as questions will likely go to them.

DEC 2020

Public roll-out

Guidance and training documentation made public.


EARLY 2021


An MSD colleague is looking to improve another part of the wider process (the data preparation) through standard definitions.

EARLY 2021


Focus on promotion to a wider audience of researchers so that they know of and use the tool.
