The challenges of data management and analysis on a large longitudinal qualitative research project
Computer-aided qualitative data analysis has the potential to revolutionise both the scale of research and the analysis techniques available. Yet the software itself still imposes limits that prevent this potential from being fully realised. This post looks at the large and complex dataset created as part of the Welfare Conditionality research project, the analytical approach adopted, and the challenges that qualitative data analysis software (QDAS) faces.
The Welfare Conditionality project sets out to consider the issues surrounding sanctions, support, and behaviour change through two broad research questions. Firstly, is conditionality ‘effective’ – and if so, for whom, under what conditions, and by what definition of effective? Secondly, is welfare conditionality ‘ethical’ – how do people justify or criticise its use, and for what reasons? To answer these questions, we have undertaken the ambitious task of collecting a trove of qualitative data on conditional forms of welfare. Our work spans nine policy areas, each of which has a dedicated ‘policy team’ responsible for the research. The policy areas are: unemployed people, Universal Credit claimants, lone parents, disabled people, social tenants, homeless people, individuals/families subject to antisocial behaviour orders or family intervention projects, (ex-)offenders, and migrants. The research has consisted of 45 interviews with policy stakeholders (MPs, civil servants, heads of charities), 27 focus groups with service providers, and three waves of repeat qualitative interviews with 481 welfare service users across 10 interview locations in England and Scotland.
Our first data management and analysis task was to deal with the logistics of storing and organising data on this scale. One of our key protocols has been the creation of a centralised Excel sheet used to collate participant information, contact details, and the stage each interview has reached. It tells us, for example, when the interview recording has been uploaded to a shared network drive, transcribed, anonymised, and added to our NVivo project file, and when the case node has been created, attributes assigned, and the transcript auto-coded and then coded and summarised in a framework matrix. On the analysis side, we have been using the server edition of NVivo. It became clear early in the fieldwork that working with multiple stand-alone project files that would be regularly merged and then redistributed would be impractical – with a high risk of merge conflicts arising due to the complexity of our data. The server project means multiple team members can access and work in the project file at the same time.
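To make the tracking sheet concrete, here is a minimal Python sketch of the same idea: one record per interview, with an ordered list of processing stages. It is an illustration rather than a description of our actual spreadsheet. The stage identifiers, the `InterviewRecord` class, the `outstanding_by_stage` helper, and the participant IDs are all invented for the example.

```python
from dataclasses import dataclass, field

# Ordered processing stages, paraphrased from the workflow described above.
# The identifiers themselves are hypothetical.
STAGES = [
    "uploaded",
    "transcribed",
    "anonymised",
    "added_to_nvivo",
    "case_node_created",
    "attributes_assigned",
    "auto_coded",
    "coded_and_summarised",
]


@dataclass
class InterviewRecord:
    """One row of the tracking sheet: a single interview in a single wave."""
    participant_id: str
    wave: int
    completed: set = field(default_factory=set)

    def next_stage(self):
        """Return the first stage not yet completed, or None if all are done."""
        for stage in STAGES:
            if stage not in self.completed:
                return stage
        return None


def outstanding_by_stage(records):
    """Count how many interviews are waiting at each stage."""
    counts = {stage: 0 for stage in STAGES}
    for record in records:
        stage = record.next_stage()
        if stage is not None:
            counts[stage] += 1
    return counts


# Invented sample data, for illustration only.
records = [
    InterviewRecord("P001", 1, {"uploaded", "transcribed"}),
    InterviewRecord("P002", 1, set(STAGES)),
    InterviewRecord("P003", 2, set()),
]

for record in records:
    print(record.participant_id, record.wave, record.next_stage())
```

Treating the stages as an ordered list means a single question, "what is the next outstanding step?", can be answered for every interview, which is the kind of overview the shared sheet gives a team working across several locations.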
Another emerging challenge was the difficulty team members had in being involved in time-intensive fieldwork while also dedicating sufficient time to analysis. We also needed to find an analytical approach that could offer information at a range of levels, i.e. by individual over time, as well as across and within the policy areas and welfare domains under investigation. There was debate amongst team members about whether each policy team should conduct its own analysis independently or whether we should adopt a shared approach. Some felt a shared approach would be too time-consuming compared to coding for specific outputs, and that there were not enough commonalities between the policy areas for a shared approach to be workable. Others felt that coding for specific outputs would result in unnecessary repetition of analysis and make it difficult to reach general conclusions across the whole sample.