Using Fuzzy Set Qualitative Comparative Analysis, fsQCA [CASE STUDY]

By Sophia Greeley, ImpactReady Consulting Fellow 2013

Fuzzy-set analysis works from the assumption that life isn’t black and white. Things can be more black, or more white, but rarely are they all one or the other. Certain questions, such as whether you are male or female, are simple enough, but others, such as “are you happy with the government’s economic policy?”, tend to produce a more mixed answer, with a respondent being perhaps more in favour of the policy than against it, but not completely happy or completely unhappy with it. Fuzzy-set analysis provides a way to capture this grey data.

For example, imagine doing a piece of research on what factors contribute to successful earthquake responses. The question requires that the researcher determine whether certain conditions were absent or present. This presents a challenge because within the social sciences determining if a condition exists is not a straightforward matter. For example, are funds disbursed or not? Usually they are partly disbursed. Were the beneficiaries consulted or not? The answer is often that they were consulted, but not adequately. Things are rarely one or the other; they are often a bit one way and also a bit the other way – “grey”, rather than black or white. Using dichotomous data – such as a condition being present or absent – can result in a significant loss of information. Fuzzy-set analysis provides a way to avoid this loss.

Let us say that the study conducts a qualitative comparative analysis (QCA) of twenty-one earthquakes (the cases). QCA is useful for phenomena that have complex explanations that depend on a combination of causes, which are best studied in conjunction with, rather than in isolation from, each other. Since humanitarian responses depend on a great many things, including the level of development of the country, an effective supply chain and funding, to name but a few, this methodology can be deemed appropriate for this study. A literature review is undertaken, and this identifies nine different variables that are thought to affect the success of a humanitarian response. The use of fuzzy-set data should allow greater accuracy in the analysis of these nine variables, as it does not require rounding off the data for simplicity’s sake. Fuzzy-set analysis allows the greyness to be included in the analysis.

THE VARIABLES

To illustrate the use of fuzzy-set analysis, let’s focus on one of the nine variables – how developed a country is. The level of development in a country is thought to affect how successfully governments, NGOs and development agencies like the UN are able to respond. The hypothesis states that the level of development and the success of a response will be positively correlated, with those in more developed countries more likely to be able to recover than those in less developed countries. To determine a country’s level of development, a suitable measure could be the United Nations Human Development Index (HDI). The HDI provides a score between zero and one for each country, as well as the country’s rank in relation to other countries.

With dichotomous data in QCA, cases are given either a 1 to denote presence, or a 0 to denote absence, of a condition. In fuzzy-set QCA (fsQCA), however, cases are assessed for their degree of membership within a condition, which gives a score between zero and one. To establish fuzzy-set scores, conventional variables must be calibrated. Calibration requires that the measure of a variable conforms to external standards (unlike uncalibrated measures, where variable values are taken in relation to one another). Calibration draws on theoretical and substantive knowledge to produce a fuzzy-set score that relates to the degree of membership in a set. To generate these scores, you must first specify the threshold for full membership of the condition (which gets a fuzzy score of 0.95), the threshold for full non-membership (fuzzy score 0.05) and the cross-over point (fuzzy score 0.5), where the condition is as much present as it is absent.

For the example of HDI, fuzzy scores cannot be assigned using the country’s HDI rank position since this would not denote external standards – it only informs us of a country’s position in relation to another country. By contrast, the HDI score for a country can be used for calibration as these are external standards. From here, it is possible to decide which HDI score would imply full membership, full non-membership and the cross-over point.

The HDI categorises scores as: very high level of development, high level of development, medium level and a low level of development. The process of calibration should not be mechanical: it should draw on theoretical sense and apply social knowledge. Here, for example, a country which scores a very high level of development does not need to be considered differently than a country with a high level of development – countries in both categories are developed and hence the categories can be combined. Looking at the data, suitable thresholds for HDI were determined: 0.787 for full membership, 0.600 for the cross-over point and 0.470 for full non-membership. These thresholds are then used to convert variable values into fuzzy membership scores.
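
The conversion from HDI score to fuzzy membership can be sketched in a few lines of Python. This is a minimal illustration of a log-odds calibration, assuming that the three thresholds map exactly to memberships of 0.95, 0.5 and 0.05; the fsQCA software may differ in detail. The function name calibrate and the sample HDI values are hypothetical and are not taken from the study.

```python
import math

# Thresholds for the "developed country" set, as given in the text.
FULL_MEMBERSHIP = 0.787
CROSSOVER = 0.600
FULL_NON_MEMBERSHIP = 0.470


def calibrate(value, full=FULL_MEMBERSHIP, crossover=CROSSOVER,
              non=FULL_NON_MEMBERSHIP):
    """Convert a raw score into a fuzzy membership score via log odds.

    Assumption: the log odds at the full-membership threshold are
    ln(0.95 / 0.05), so the thresholds map to 0.95, 0.5 and 0.05.
    """
    log_odds_at_full = math.log(0.95 / 0.05)  # roughly 2.94
    deviation = value - crossover
    if deviation >= 0:
        # Scale deviations above the cross-over towards full membership.
        scalar = log_odds_at_full / (full - crossover)
    else:
        # Scale deviations below the cross-over towards non-membership.
        scalar = -log_odds_at_full / (non - crossover)
    log_odds = deviation * scalar
    # Convert log odds back into a degree of membership between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical HDI scores, for illustration only.
for hdi in (0.45, 0.47, 0.60, 0.70, 0.787, 0.90):
    print(f"HDI {hdi:.3f} -> fuzzy membership {calibrate(hdi):.2f}")
```

A score at one of the three thresholds returns 0.95, 0.5 or 0.05 respectively; scores in between are spread smoothly along the S-shaped curve.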

THE TRUTH TABLE

From here the process requires the use of fsQCA software developed by Charles Ragin. To construct the fuzzy scores, the three thresholds for each variable must be entered into the software. The software uses these thresholds to convert variable values into fuzzy membership scores, using transformations based on the logarithmic odds of full membership. The software analyses the data to produce a truth table, which displays all the possible combinations of causes leading to the outcome, which in this study is a successful humanitarian response to earthquakes.

The truth table lists all the logically possible configurations, of which there are 2^k, where k is the number of causal conditions. In this study there are nine possible causal conditions, resulting in 2^9 configurations – a total of five hundred and twelve possible paths to the outcome. Each case is now considered as a configuration – a combination of the characteristics selected – and the software reports how many instances there are of each configuration. Since there is limited diversity in social phenomena, it can be expected that there are many configurations of which there is no empirical evidence. This study produced instances of thirteen configurations. The configurations for which there are no instances can be deleted from the truth table, thereby excluding them from the minimisation procedure.
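
As a rough sketch of this step, the Python below enumerates the logically possible configurations and counts how many cases fall into each. It assumes that a case belongs to the configuration in which its membership is above 0.5 on every present condition and below 0.5 on every absent one. The three cases and their membership scores are hypothetical; only the condition names come from the study.

```python
from collections import Counter
from itertools import product

# Condition names in the column order of the study's truth table.
CONDITIONS = ["dev", "gov", "corrupt", "supply", "fund",
              "coord", "scale", "market", "capacity"]

# Every logically possible configuration: 2^k rows for k conditions.
all_rows = list(product((0, 1), repeat=len(CONDITIONS)))
print(len(all_rows))  # 512 for nine conditions


def row_for_case(memberships):
    """Return the truth table row in which the case has membership above 0.5."""
    return tuple(1 if memberships[name] > 0.5 else 0 for name in CONDITIONS)


# Hypothetical fuzzy membership scores for three cases (illustration only).
cases = {
    "case_a": dict(zip(CONDITIONS, [0.9, 0.2, 0.1, 0.3, 0.8, 0.7, 0.6, 0.2, 0.9])),
    "case_b": dict(zip(CONDITIONS, [0.8, 0.1, 0.3, 0.4, 0.9, 0.6, 0.7, 0.1, 0.8])),
    "case_c": dict(zip(CONDITIONS, [0.2, 0.1, 0.2, 0.7, 0.3, 0.4, 0.2, 0.3, 0.6])),
}

# Count the instances of each configuration; rows with no instances are
# simply absent from the counter, which mirrors deleting them from the table.
counts = Counter(row_for_case(scores) for scores in cases.values())
for row, number in counts.items():
    print(row, number)
```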

Here is an edited version of the truth table for this study, after the deletion of non-relevant configurations.

Dev  Gov  corrupt  Supply  Fund  coord  scale  market  capacity  number  success  raw consist.  PRI consist.  Product
0    0    0        0       1     1      1      0       1         3       1        1             1             1
1    0    0        0       1     1      1      0       1         1       1        1             1             1
1    1    1        1       0     0      1      1       1         2       1        1             1             1
0    0    0        0       0     0      0      0       1         1       1        1             1             1
0    0    0        0       0     1      1      0       1         1       1        1             1             1
0    0    0        1       0     0      0      0       1         1       1        1             1             1
0    0    0        1       0     1      1      0       1         1       1        1             1             1
1    0    0        0       0     0      1      0       0         1       1        1             1             1
1    1    1        0       0     0      1      1       1         1       1        1             1             1
1    0    0        1       0     0      0      0       0         1       1        0.929577      0.166666      0.154929
0    0    0        1       0     0      1      0       1         3       1        0.910156      0.858896      0.781729
0    0    0        1       1     0      0      0       1         1       0        0.277108      0             0
0    0    0        1       1     0      1      0       1         1       0        0.195489      0             0

The researcher must decide upon the level of consistency required. The raw consistency level, shown in one of the right-hand columns of the truth table above, indicates whether the membership score on the outcome is consistently higher than the membership score of the causal combination, and also takes into consideration the strength of the membership scores. Stronger membership scores present more relevant cases. Ragin (2008b:78) suggests a consistency cut-off above 0.9. In the success column, a one is placed beside configurations that meet the consistency threshold and a zero beside those that do not. In this study, the recommended consistency threshold of 0.9 is chosen. Looking at the above table, all but the last two configurations meet the threshold.
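
The consistency measure and the cut-off can be sketched as follows. The formula used here, the sum of the smaller of each pair of scores divided by the sum of the configuration scores, is the standard fuzzy-set consistency measure for sufficiency; the function name and the three-case example are hypothetical, while the raw consistency values are copied from the truth table above.

```python
def consistency(config_scores, outcome_scores):
    """Fuzzy-set consistency of 'configuration is sufficient for outcome'."""
    overlap = sum(min(x, y) for x, y in zip(config_scores, outcome_scores))
    return overlap / sum(config_scores)


# Hypothetical membership scores for three cases (illustration only).
config = [0.8, 0.6, 0.3]
outcome = [0.9, 0.5, 0.7]
print(round(consistency(config, outcome), 3))

# Raw consistency column from the truth table above, in row order.
raw_consistency = [1, 1, 1, 1, 1, 1, 1, 1, 1,
                   0.929577, 0.910156, 0.277108, 0.195489]

CUTOFF = 0.9  # threshold chosen in this study

# A one marks configurations that meet the threshold, a zero those that do not.
success = [1 if value >= CUTOFF else 0 for value in raw_consistency]
print(success)
```

Only the second case in the hypothetical example has an outcome score below its configuration score, which is what pulls the consistency below 1.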

THREE SOLUTIONS

The truth table is now ready for the standard analysis, when the minimisation process occurs. The minimisation process uses the techniques of prime implicants and De Morgan’s Law to generate three possible solutions. The software produces a complex, a parsimonious and an intermediate solution. The three solutions generated represent configurations that are deemed to be sufficient for the outcome to occur and each is based on different assumptions.
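
A single step of this minimisation can be illustrated in Python. The sketch below is not the software's implementation; it only shows the basic reduction rule, under which two rows that lead to the outcome and differ in exactly one condition can be combined, with that condition dropped. The function names are hypothetical; the two rows are the first two rows of the truth table above.

```python
CONDITIONS = ["dev", "gov", "corrupt", "supply", "fund",
              "coord", "scale", "market", "capacity"]


def merge(row_a, row_b):
    """Combine two rows that differ in exactly one condition, else return None."""
    differences = [i for i, (a, b) in enumerate(zip(row_a, row_b)) if a != b]
    if len(differences) != 1:
        return None
    merged = list(row_a)
    merged[differences[0]] = "-"  # "-" marks a condition that no longer matters
    return tuple(merged)


def as_term(row):
    """Write a row as a product of conditions, using ~ for negation."""
    parts = []
    for name, value in zip(CONDITIONS, row):
        if value == "-":
            continue  # a dropped condition does not appear in the term
        parts.append(name if value == 1 else "~" + name)
    return "*".join(parts)


# First two rows of the truth table; both have a one in the success column.
row_1 = (0, 0, 0, 0, 1, 1, 1, 0, 1)
row_2 = (1, 0, 0, 0, 1, 1, 1, 0, 1)

merged = merge(row_1, row_2)
print(merged)
print(as_term(merged))
```

The two rows differ only in dev, so the merged term no longer mentions it. Repeating this step until no further rows can be combined yields the prime implicants.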

Set-theoretic reasoning is used to interpret the solutions. Where instances of an outcome constitute a subset of instances of the cause, the cause is considered a necessary condition. To form such a subset, the membership on the outcome, successful response, must be consistently less than or equal to the membership on the cause or configuration. In this study no necessary conditions were observed. A sufficient cause or combination of causes results when the membership score on the outcome is consistently higher than or equal to the membership score of the causal combination. A case’s membership in a configuration, on which the consistency score is based, is the minimum of its fuzzy scores across the conditions (Kent 2008:4).
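
These set operations are simple to express in code. The sketch below assumes the usual fuzzy-set conventions: logical AND is the minimum of the scores, and negation is one minus the score. The function names and the membership scores are hypothetical.

```python
def fuzzy_not(score):
    """Membership in the negated set, e.g. ~fund."""
    return 1.0 - score


def fuzzy_and(*scores):
    """Membership in a configuration: the minimum across its conditions."""
    return min(scores)


def is_subset(inner, outer):
    """True if every case's score in 'inner' is no higher than in 'outer'."""
    return all(i <= o for i, o in zip(inner, outer))


# One hypothetical case and its membership in capacity*~scale*~fund.
case = {"capacity": 0.8, "scale": 0.3, "fund": 0.4}
membership = fuzzy_and(case["capacity"],
                       fuzzy_not(case["scale"]),
                       fuzzy_not(case["fund"]))
print(round(membership, 2))  # minimum of 0.8, 0.7 and 0.6

# Hypothetical scores for three cases.
configuration = [0.6, 0.2, 0.4]
outcome = [0.9, 0.3, 0.4]

# Sufficiency: the configuration is a subset of the outcome.
print(is_subset(configuration, outcome))
# Necessity: the outcome is a subset of the cause.
print(is_subset(outcome, configuration))
```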

Ragin (2008a:44) defines set-theoretic consistency as “the degree to which cases sharing a given combination of conditions agree in displaying the outcome in question”. It provides evidence of a sufficient cause or configuration by gauging how closely a perfect subset relation exists. Set-theoretic coverage tells the researcher “the degree to which a cause or causal combination ‘accounts for’ instances of an outcome” (ibid). When studies have few cases, the coverage may be low, even when consistency is high, since there are often many paths to an outcome with social phenomena. Coverage relates to empirical evidence rather than theory, and is similar to statistical variance. Consistency, on the other hand, is compared to statistical significance, and Ragin (2008a, ch.3) emphasises that just as a statistically significant but weak correlation can exist, so too can a highly consistent set-theoretic relation with low coverage.
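
The difference between the two measures can be seen in a small sketch. Both share the same numerator, the sum of the smaller of each pair of scores; consistency divides it by the sum of the configuration scores and coverage by the sum of the outcome scores. The function names and the four hypothetical cases are chosen purely to show a fully consistent configuration with low coverage.

```python
def overlap(config_scores, outcome_scores):
    """Sum of the smaller of each pair of membership scores."""
    return sum(min(x, y) for x, y in zip(config_scores, outcome_scores))


def consistency(config_scores, outcome_scores):
    """How closely the configuration is a subset of the outcome."""
    return overlap(config_scores, outcome_scores) / sum(config_scores)


def coverage(config_scores, outcome_scores):
    """How much of the outcome the configuration accounts for."""
    return overlap(config_scores, outcome_scores) / sum(outcome_scores)


# Hypothetical scores: every case shows the outcome strongly, but few cases
# have much membership in this particular configuration.
config = [0.2, 0.1, 0.0, 0.1]
outcome = [0.9, 0.8, 0.7, 0.9]

print(round(consistency(config, outcome), 2))
print(round(coverage(config, outcome), 2))
```

Here the configuration never exceeds the outcome, so consistency is perfect, yet it accounts for only a small share of the outcome, which suggests that other paths are doing most of the work.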

The three solutions follow different assumptions. The complex solution uses no counterfactuals. The parsimonious solution allows all counterfactuals, both “easy” and “difficult” ones, and so may deliver an explanation that is unrealistically parsimonious. The intermediate solution incorporates only “easy” counterfactuals, and is the simplest to interpret. Counterfactuals are useful when there is limited diversity, as is the case in humanitarian response to earthquakes. The distinction between “easy” and “difficult” counterfactuals concerns whether a condition that is assumed to be redundant is included in or excluded from the solution.

When there is a configuration that is known to produce a successful outcome, and a redundant condition is added to the combination on the conjecture that this will still lead to the outcome, this is considered an “easy” counterfactual. A “difficult” counterfactual is the inverse – an assumed redundant condition is removed from a configuration known to lead to the outcome, on the assumption that the outcome will still occur. Ragin therefore suggests that the best approach to interpreting the results is to view them on a continuum, with the complex solution at one end, the parsimonious solution at the other, and the intermediate solution somewhere in between. Here the intermediate solutions constitute subsets of the parsimonious solution and supersets of the complex solution.

CONFIGURATIONS FOR SUCCESS

The intermediate solution is the simplest to interpret, and the output table from the software for the intermediate solution is shown below. The bottom of the table shows four configurations leading to successful humanitarian response. One of the solutions is capacity*~scale*~fund (where ~ indicates negation and * indicates logical AND), which can be interpreted as: where there is a high level of inter-agency capacity, with a low scale of disaster and not enough funds, there will be a successful humanitarian response. The interpretation of ~fund is difficult – it is not thought that inadequate funding leads to a successful humanitarian response, yet it is present in three of the intermediate solution configurations, so there may be some potential conclusions to draw about funding levels, which we will put to one side for the time being.

Theoretical sense must be applied throughout the analysis. It is not thought that ~fund – denoting “inadequate funding” – is likely to contribute to a successful response, and Ragin (2008a:172) advocates the removal of such inconsistent conditions to produce the optimal intermediate solutions. If ~fund is removed, based on theoretical grounds, this leaves the following causes and configurations as sufficient to produce a successful response to earthquakes:

dev
capacity*coord
capacity*~scale
capacity*supply
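
The removal of ~fund from the four intermediate configurations is a mechanical step and can be sketched in a few lines. The function name is hypothetical; the terms are copied from the software output for the intermediate solution, which uses the short name coord for coordination.

```python
def drop_condition(term, condition):
    """Remove one condition (e.g. '~fund') from a product term such as 'a*b*c'."""
    kept = [literal for literal in term.split("*") if literal != condition]
    return "*".join(kept)


# Configurations reported by the software for the intermediate solution.
intermediate_solution = [
    "~fund*dev",
    "capacity*coord",
    "capacity*~scale*~fund",
    "capacity*~fund*supply",
]

# Remove ~fund on theoretical grounds, as described in the text.
simplified = [drop_condition(term, "~fund") for term in intermediate_solution]
for term in simplified:
    print(term)
```

Note that this only rewrites the terms. It does not recompute consistency or coverage, which would have to be checked again against the data for the shortened configurations.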

Each of these solutions receives a high level of consistency, one at 0.95, one at 0.97 and the remaining two achieving a consistency score of 1.00, with the intermediate solution overall consistency at 0.97. The overall coverage of these four combinations in the intermediate solution is 0.92, indicating that these solutions combined cover ninety-two per cent of the earthquake responses.

The Intermediate solution – output produced following truth table analysis
**********************
*TRUTH TABLE ANALYSIS*
**********************
(Model: success = f(capacity, market, scale, coord, fund, supply, corrupt, gov, dev)
Rows: 552
Algorithm: Quine-McCluskey
True: 1
0 Matrix: 0L
Don't Care: -
--- INTERMEDIATE SOLUTION ---
frequency cutoff: 1.000000
consistency cutoff: 0.910156
Assumptions:
capacity (present)
market (present)
~scale (absent)
coord (present)
fund (present)
supply (present)
corrupt (present)
gov (present)
dev (present)
                         raw         unique
                         coverage    coverage    consistency
                         ----------  ----------  ----------
~fund*dev                0.546087    0.219130    0.978193
capacity*coord           0.369855    0.222029    1.000000
capacity*~scale*~fund    0.184348    0.023188    1.000000
capacity*~fund*supply    0.342029    0.071304    0.954693
solution coverage: 0.924638
solution consistency: 0.970785

For more information on using fsQCA, you can refer to the following aids:
Ragin, C.C. (1987) The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies (Berkeley: University of California Press)
Ragin, C.C. (2000) Fuzzy-Set Social Science (Chicago: University of Chicago Press)
Ragin, C.C. (2008a) Redesigning Social Inquiry: Fuzzy Sets and Beyond (Chicago and London: University of Chicago Press)
Ragin, C.C. (2008b) User’s Guide to Fuzzy-Set/Qualitative Comparative Analysis 2.0 (Arizona: Department of Sociology, University of Arizona)

The fsQCA software can be downloaded free online at: www.u.arizona.edu/~cragin/fsQCA/software.shtml