Quantitative Methods: Regionalisation through Composite Indices

By: Sucharita Sen
Though composite indices have some limitations, they are an enabling tool for geographical regionalisation, and one that is becoming increasingly popular with policy makers. This article compares development indices worked out using the human development index (HDI) method and principal component analysis (PCA).

Geographers carve out areas from a heterogeneous space, defining physiographic, agro-climatic, manufacturing, development and various other regions. These form the lens through which they conduct further analysis. Since the issues dealt with are commonly multi-dimensional, there arises a need to bring the different facets together to enable regionalisation. A scholar engaged in gender geography may be interested in literacy differences, enrolment ratios in higher education, work participation rates, as well as wage rate differences between men and women to arrive at regions of gender deprivation. The need to understand multiple dimensions in a simplified form is thus a commonly encountered problem, and it is frequently handled by aggregating the different dimensions to form composite indices. An index is a composite of indicators of phenomena like development, vulnerabilities, exclusions and the like. Besides being a tool of regionalisation, composite indicators that compare country performance are increasingly being recognised as useful in policy analysis and public communication. One of the best known examples is the human development index (HDI).

Constructing Indices: Pros and Cons of Aggregation
There is a lot of debate about the desirability of aggregating a large number of variables into one single indicator that best represents the original data set. On one hand, the pro-aggregation groups argue that ‘composite indicators are a way of distilling reality into a manageable form’; on the other, the anti-aggregation lobby argues that ‘composite indicators are confusing entities whereby apples and pears are added up in the absence of a formal model or justification’.
Pros of composite indicators:
■ Can summarise complex, multi-dimensional realities with a view to supporting decision-makers.
■ Are easier to interpret than a battery of many separate indicators.
■ Can assess progress of countries/states over time.
■ Reduce the visible size of a set of indicators without dropping the underlying information base.
■ Place issues of state/country performance and progress at the centre of the policy arena.

Cons of composite indicators:
■ May send misleading policy messages if poorly constructed or misinterpreted.
■ May invite simplistic policy conclusions.
■ May be misused, e.g. to support a desired policy, if the construction process is not transparent and/or lacks sound statistical or conceptual principles.
■ The selection of indicators and weights could be the subject of political dispute.
■ May lead to inappropriate policies if dimensions of performance that are difficult to measure are ignored.

Notwithstanding the contrasting views, composite indicators represent important theoretical progress and have some relevant advantages in terms of policy. They have experienced a surge of popularity because of their ability to represent complex concepts such as sustainability, welfare or technological advancement in a succinct manner.

A good composite index should follow a few thumb rules:

Validity: What the index is supposed to measure must be crystal clear. The indicators that are used to measure the index need to convey, in totality, the concept behind the composite index.
Reliability: The sources of data for the various indicators should be acceptable. If the sources are primary, the methodology for data collection should be known; if they are secondary, the data should preferably be available in the public domain. In sum, the index should yield similar results if different scholars attempted to measure it.
Parsimony: The whole point of an index is to simplify the measurement of a particular phenomenon. On the other hand, an index that relies upon too few indicators undermines its validity.
Uni-directionality: The indicators used to construct the index should be highly correlated with each other to yield a robust result. Some methods like principal component analysis (PCA) reject indicators that are not correlated with others, ensuring high association among the indicators.
Treatment of variables: Depending on the method for aggregation, the indicators used should be made scale-free or normalised. While there is more than one method to achieve this, one of the common ways is to subtract the average from each value and divide the result by the standard deviation.
Assigning weights: Any aggregation method needs to decide on the weights to be assigned to each of the variables. These weights could be subjective or equal, and this has been the main criticism of scholars who are not in favour of aggregation of indicators. PCA assigns weights in an objective manner that is free of the judgment of the scholar using the method.
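
The standardisation described under 'Treatment of variables' can be sketched in a few lines of Python. This is only an illustration: the function name `standardise`, the use of the population standard deviation and the literacy figures are assumptions made for the example, not values or choices taken from the article.

```python
import statistics

def standardise(values):
    """Return z-scores: (value - mean) / standard deviation."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation; the sample version
    # (statistics.stdev) is an equally common choice.
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("indicator has no variation; z-scores are undefined")
    return [(v - mean) / sd for v in values]

# Hypothetical literacy rates (per cent) for five regions; not real data.
literacy = [60.0, 70.0, 80.0, 90.0, 100.0]
z_scores = standardise(literacy)
```

After the transformation the indicator has a mean of 0 and a standard deviation of 1, so indicators measured in different units can be combined.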

Table 1: Weights used in principal component analysis (PCA) and human development index (HDI) methods
* Significant at 1 per cent level (with respect to their correlation with the composite index)
# Measured as a ratio of non-poor to make it directly related to economic development for the HDI method


Figure 1: Distribution of the indices computed by principal component analysis (PCA) and human development index (HDI) methods

Methods of Aggregation

There are several methods of constructing a composite index, but two established and commonly used methods are discussed here. The first is HDI, which is simple to understand as it takes into consideration variables from three broad domains: health (life expectancy at birth), education (adult literacy and gross enrolment, with two-thirds and one-third weights respectively) and wealth (GDP per capita). These three domains are aggregated by assigning equal weights to each domain. The variables are transformed by subtracting the minimum value of the indicator and dividing the result by the range of the indicator. This is known as the range equalisation method; the transformation not only renders the variable scale-free, but also rescales it so that its minimum value becomes 0 and its maximum 1. A variant of this method is the goalpost method of transformation, wherein the minimum and maximum values are determined by a theoretical minimum and maximum and not by the data distribution. For example, literacy can have a minimum value of 0 (all illiterates) and a maximum of 100 (all literates). These two values, in the goalpost method, become the minimum and maximum. HDI uses the goalpost method, and this enables a country to evaluate how far it is from the maximum achievable point.
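
The two transformations and the equal-weight aggregation described above can be sketched as follows. The sketch follows only the steps given in the text; the function names and all input values are hypothetical and chosen purely for illustration.

```python
def range_equalise(values):
    """Rescale using the observed minimum and maximum of the data."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator has no variation; range is zero")
    return [(v - lo) / (hi - lo) for v in values]

def goalpost(value, minimum, maximum):
    """Rescale using theoretical goalposts instead of the observed range."""
    return (value - minimum) / (maximum - minimum)

def hdi_style_index(health, literacy, enrolment, wealth):
    """Equal-weight mean of three domains; all inputs already lie in [0, 1].

    Education combines literacy (two-thirds) and enrolment (one-third),
    as described in the text.
    """
    education = (2 / 3) * literacy + (1 / 3) * enrolment
    return (health + education + wealth) / 3

# Hypothetical values, not real data.
scaled = range_equalise([20.0, 30.0, 50.0])      # observed range 20 to 50
literacy_score = goalpost(74.0, 0.0, 100.0)      # goalposts 0 and 100
index = hdi_style_index(health=0.6, literacy=0.9, enrolment=0.6, wealth=0.5)
```

With range equalisation the lowest-scoring region always gets 0 and the highest 1, whatever their actual levels. With goalposts a score of 1 is reached only at the theoretical maximum, which is what allows a region to see how far it is from the achievable limit.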

PCA is a multivariate statistical technique used to reduce the number of variables in a data set to a smaller number of ‘dimensions’ or ‘components’. The HDI method accords equal weights to each domain and is hence subjective, one of the points around which much of the criticism revolves. PCA, on the other hand, assigns weights to each indicator objectively, depending on the extent to which it is correlated with the other indicators in totality. The components are ordered so that the first component (PC1) explains the largest amount of variation possible. This first component is often used as a composite index if it explains a sufficient proportion of variation in the data series that is being compressed to form the index. The problem with PCA, however, is threefold. Firstly, some indicators can be dropped from the aggregation process if they are not sufficiently correlated with the other indicators, irrespective of their theoretical importance in explaining the phenomenon under observation. Secondly, it is a relative index that cannot be compared over time. Thirdly, no maximum and minimum are set for an index computed through PCA.
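
A minimal sketch of using PC1 as a composite index is given below, assuming NumPy is available. It follows one common convention, taking the eigenvector of the correlation matrix with the largest eigenvalue as the set of weights; the article does not state which software or convention was used, and the data matrix is entirely hypothetical.

```python
import numpy as np

def pca_first_component(data):
    """Return (weights, explained_share, scores) for the first component.

    data: rows are regions, columns are indicators.
    """
    data = np.asarray(data, dtype=float)
    # Standardise each indicator so that units do not affect the weights.
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    weights = eigenvectors[:, -1]
    # The sign of an eigenvector is arbitrary; orient it so that the
    # first indicator carries a positive weight (an assumption).
    if weights[0] < 0:
        weights = -weights
    explained_share = eigenvalues[-1] / eigenvalues.sum()
    scores = z @ weights
    return weights, explained_share, scores

# Hypothetical data for six regions: income, urban share, poverty ratio.
data = [
    [20, 15, 40],
    [35, 25, 30],
    [50, 30, 22],
    [65, 45, 15],
    [80, 50, 10],
    [30, 20, 35],
]
weights, explained_share, scores = pca_first_component(data)
```

In this made-up data set the poverty column moves against the other two, so it receives a negative weight. The scores are centred on zero and have no fixed minimum or maximum, which illustrates why a PCA-based index is relative to the data set it was computed from.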

Table 2: Deviations in ranks of states in the indices computed through principal component analysis (PCA) and human development index (HDI) methods

Table 2 reveals that a majority of the states get the same rank or have a deviation of only 1 between the ranks they are assigned based on the values of the two indices worked out by the PCA and the HDI methods. This indicates that the two methods have yielded broadly similar results, barring the cases of the four states of Karnataka, Jammu and Kashmir, Maharashtra and Nagaland.
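
A rank comparison of this kind can be sketched as below. The region names and index values are hypothetical placeholders, not the figures behind Table 2, and the helper names `ranks` and `rank_deviations` are introduced only for this illustration.

```python
def ranks(scores):
    """Rank 1 goes to the highest score; ties keep their input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result

def rank_deviations(names, index_a, index_b):
    """Absolute difference between the ranks given by two indices."""
    ranks_a, ranks_b = ranks(index_a), ranks(index_b)
    return {n: abs(a - b) for n, a, b in zip(names, ranks_a, ranks_b)}

# Hypothetical regions and index values; not real data.
names = ["A", "B", "C", "D"]
pca_index = [1.2, 0.3, -0.4, -1.1]
hdi_index = [0.80, 0.45, 0.50, 0.20]
deviations = rank_deviations(names, pca_index, hdi_index)
```

Only the ordering of regions enters the comparison, so it does not matter that the two indices are expressed on different scales.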

Figure 2: Regionalisation in terms of economic development obtained through PCA

Composite Index: An Example

A simple index of regionalising India by its levels of economic development has been calculated using both methods explained earlier. The three variables that have been taken to portray economic development in different states of India are:
Per capita net state domestic product (PCNSDP): It represents the welfare and level of living of a state and district. Indicators like per capita state income are now frequently used by the Planning Commission and Finance Commission for devolution of planned resources to different states.
Percentage of urban population to total: The rural-urban divide is sharp in India, both in terms of economic parameters such as infrastructure endowment and social parameters such as literacy rates etc. The share of urban population is a proxy for overall infrastructure development.
Poverty ratio: This is the ratio of people under the poverty line, which is fixed by consumption levels measured in monetary terms. The poverty line in each state has been fixed differently, adjusted by the respective price index of selected commodities.

It can be observed from table 1 that while the HDI method has equal weights, there is some variation in the weights assigned by the PCA method, with PCNSDP assigned the maximum weight. Poverty gets a negative weight since poverty is inversely correlated with the level of economic development. Fig. 1 shows the difference in the distribution of the two indices. Table 2 provides the deviations in the ranks of the different states derived from the indices of the two methods. The deviations are highest for the states that differ significantly in their relative positioning with respect to the three indicators of the composite indices; these states are Karnataka and Jammu and Kashmir. The deviations in ranks result from the difference in the weights assigned to the indicators by the two methods. Fig. 2 is an example of regionalisation in terms of economic development obtained through PCA.
