Monday, April 06, 2015

Testing Srini 123 Maps

Testing Srini 123 Multiple Maps. From Google.

This is a Test ...

This is a Test..

Test. Test. Test.

Friday, December 26, 2014

Jammu & Kashmir's near polarised verdict, explained in maps.

The assembly elections in Jammu & Kashmir threw up very interesting results. The results could be summarised as reflecting the name of the state: Jammu AND Kashmir. In other words, it was as if the different regions of J&K had completely different political choices in mind and, essentially, in effect.

I try to show that through three maps here.

The first one is a map of which party won which constituency (click on the individual constituencies to find out the winner, the runner-up and their respective voteshares). This is a constituency map prepared from Election Commission data. (Shoutout to Datameet for helping source this from

Map 1: Who Won Where?

The second map simply shows the proportion of people adhering to the dominant religions in the state (saffron represents Hinduism, green represents Islam and mild blue represents others, mostly Buddhism), across the various tehsils in the state. Data for this map is sourced from Census 2001. (To my knowledge, the religion-wise breakup of Census 2011 for the state is as yet unavailable online. I assume that the proportions haven't really changed much since 2001, even if the actual numbers have risen, as is to be expected.)

Map 2: Composition of Jammu & Kashmir by Religion (Hindu/Muslim/Others)

An eyeball comparison of the two maps shows how polarised the election was. The BJP won by large margins in all the Hindu-majority constituencies (in the Jammu region), whereas the parties based in the valley won most of the seats with a Muslim majority. The Congress party did quite well in Ladakh and Kargil, where the bulk of the "other religions", Buddhism in particular, is concentrated.

There are of course exceptions: Kishtwar, Doda (and to some extent Bhaderwah), where most tehsils have a higher proportion of Muslims among the population, have been won by the BJP.

A more detailed map showing how each party performed in each constituency (an intensity map of the vote percentages of the four main parties across all constituencies) makes the regional divide in the state's political choices clear. (Use the dropdown at the bottom of the map to choose the respective parties.)

Map 3: Vote Percentages of Respective Parties across Constituencies


Sunday, August 24, 2014

Comparing overseas batting records

India's abysmal performance in the recent series against England in that country has been universally panned. While India did manage to win one test and its bowlers performed creditably well (relatively) - running out of luck with dropped catches galore - it was the all round failure of the Indian batsmen that has caught the eye.

But there isn't much to be surprised about in India's recent record in England. Indian batsmen have traditionally struggled outside the sub-continent, where they encounter faster pitches, better seam and swing conditions, or tracks that aren't flat enough. Indian pitches, on the other hand, are relatively more conducive to turn, include a number of flat tracks, and are more difficult for faster bowlers than is the case elsewhere. That is the commonly understood story.

Is there a way to empirically verify this using nifty data visualisation tools? There is!

We set out to find out whether Indian batsmen are relatively worse off on overseas tracks than the average batsman from elsewhere.

What we do here is not just use simple averages to compare batsmen, but use a measure which we call "Runs over average batsmen" for our purposes.

It is not enough to simply compare the averages of batsmen on overseas pitches, as this measure cannot compare a player from one era with a player from another. That is because a batsman in a particular era could face better (or worse) bowlers than a batsman in another. There are also various changes in rules and cricketing conditions (one bouncer per over since the 1990s, or no helmets prior to the mid-1970s, for example). It is therefore simply not accurate to claim that X, with an average of 50 in the 1990s over just 25 overseas innings, is better than Y, with an average of 40 in the 1970s over 60 innings.

Therefore, what we ought to do is find the average number of runs scored by batsmen in the set of years in which a player played, take the difference between the batsman's own average and this era average, and multiply that difference by the number of innings he played. This will be the "overall_value" of the batsman. It is an intuitive idea that is similar to what Australian economist Nicholas Rohde used in his controversial paper to study batting records across times.

To illustrate, take VVS Laxman. He has an overall average of 45.97. His overseas average is 42.64. He has played 225 innings (34 not outs) in his career. How does he compare to someone like Gundappa Viswanath, a similar stylist from the past? During VVS Laxman's career between 1997 and 2012, the average number of runs scored by batsmen was 33.1. In overseas tests, VVS Laxman added a difference of 9.54 (42.64 - 33.1) and therefore contributed about 9.54 * 109 (such innings played) = 1040.1 runs as his overall value added as compared to the average batsman of his era. Similarly, Viswanath's overall value added was 312.39 runs over the average player of his era.
We do this exercise for all batsmen who have played test cricket from 1877 to the present, and present the results in a nifty graph below. (Hover on each cell to see the data. Lighter colours depict a better value and darker colours a lower value for the batsmen. Click on the countries to view country-specific data.)
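
The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script used to build the graph; the function name `runs_over_average` is made up for this example, and the inputs are the rounded figures quoted above for VVS Laxman (overseas average 42.64, era average 33.1, 109 overseas innings), so the result will be close to, rather than identical to, the value quoted in the text.

```python
def runs_over_average(overseas_average, era_average, innings):
    """Runs added over the average batsman of the same era.

    overseas_average: the batsman's own overseas batting average
    era_average: average runs scored by batsmen during his career span
    innings: number of overseas innings he played
    """
    return (overseas_average - era_average) * innings


# Rounded figures quoted in the post for VVS Laxman (1997-2012).
laxman_value = runs_over_average(42.64, 33.1, 109)
print(round(laxman_value, 2))
```

A positive value means the batsman scored more than an average batsman of his era would have in the same number of innings; a negative value means he scored less.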


What we notice here is that there is not too much of a difference in the overall overseas records of Indian batsmen as compared to the best of the cricketing world. India does have a sizeable number of batsmen having above average "value added" runs in overseas tests as compared to the top team, Australia. Among Indian batsmen though (click on India to view more details for Indian batsmen alone), it is evident that the previous generation of batters- Sachin Tendulkar, Rahul Dravid, VVS Laxman and to a lesser extent, V. Sehwag and S. Ganguly, constituted the best ever core India has had since it entered test cricket. Barring Sunil Gavaskar and Mohinder Amarnath in the earlier generation, no other batsman of any other era has a better overseas record than the aforementioned.

The current generation, meanwhile, has a long way to go to live up to the record of the previous one. Barring A. Rahane to some extent, most other Indian batsmen of the current team have been poor on overseas tours relative to the average batsman of this era.

Saturday, August 16, 2014

The place of Rangana Herath

Rangana Herath just completed 250 wickets after taking a fantastic 9/127 against Pakistan in an ongoing test (as I write this) played at the SSC, Colombo. 

The event prompted me to check where this diminutive, unheralded, unsung and hardworking bowler stands among his tribe of spin bowlers.

I did a simple data comparison. I extracted the top 20 spin bowlers (by wickets taken) from Cricinfo's Statsguru, and then calculated a metric, "Bowl Index" (copied from this source). "Bowl Index" basically takes into account both bowling average and strike rate.

Here's the formula: (runs conceded)^2/(balls bowled * wickets taken)

And then I normalised the formula to account for total innings bowled (Bowl Index * 1000/Total Innings Bowled). 
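
As a rough sketch, the two steps above translate into Python as follows. The function names are invented for this illustration, and the career figures in the usage lines are hypothetical round numbers, not any real bowler's record.

```python
def bowl_index(runs_conceded, balls_bowled, wickets):
    """Bowl Index as given above: runs^2 / (balls * wickets)."""
    if balls_bowled <= 0 or wickets <= 0:
        raise ValueError("balls_bowled and wickets must be positive")
    return runs_conceded ** 2 / (balls_bowled * wickets)


def normalised_bowl_index(runs_conceded, balls_bowled, wickets, innings_bowled):
    """Bowl Index scaled by 1000 and divided by total innings bowled."""
    if innings_bowled <= 0:
        raise ValueError("innings_bowled must be positive")
    return bowl_index(runs_conceded, balls_bowled, wickets) * 1000 / innings_bowled


# Hypothetical career figures, for illustration only.
raw = bowl_index(7000, 15000, 250)
scaled = normalised_bowl_index(7000, 15000, 250, 100)
print(round(raw, 2), round(scaled, 2))
```

Note that the raw index is the same thing as (bowling average)^2 divided by strike rate, which is how it takes both of those figures into account.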

The resulting data is as below: 

A Graphical representation of the above data list is below:

Rangana Herath, thus far, ranks just below Bishen Singh Bedi and Clarrie Grimmett in the all-time list. Not bad at all for the Lankan spin lynchpin whom no one expected to fill the giant shoes of Muthiah Muralitharan.

Tuesday, July 29, 2014

Summary of recent data related pieces written by me.

Over the past year, I have written a number of pieces as part of data journalism. All of the pieces are election-related (but of course it was a major election year in India). Links to the pieces (with short descriptions) that I have written are presented here: 

a) The Aam Aadmi Party's win in the Delhi Elections. (Written for the EPW Web Exclusives). The article tries to use GIS tools to understand the reasons favouring the AAP's win in the Delhi assembly elections. It comes up with interesting insights: Link

b) Articles related to the Lok Sabha elections in the EPW: 

i) Explaining the high turnout in the 2014 elections: Link. This article was written while the elections were underway, using preliminary voter turnout information released by the Chief Election Officers of different states (and UTs) in India. It seeks to explain the very high turnout in the 2014 elections and identifies variances and unexplored reasons.

ii) Preliminary statistics from the 2014 election results: Link. This article was written after the Lok Sabha elections and provides visualizations of the results: voteshares, constituency winners and losers, party performance and so on.

c) A case for proportional representation in Uttar Pradesh: (written for the site, in 2012 and basically an analysis of the election results in the state then. The slightly misleading headline is not mine). Link

d) Recent pieces in, written in March-June this year (as part of a Data Journalism fellowship): 

i) On fragmentation in India's political system over the years (since 1977) and regionalisation: Link. I seek to show the emphasis on regionalisation in India since 1977.

ii) What sways the urban voter? Results of a survey conducted by the Association of Democratic Reforms (ADR) among others. Link. This article uses survey data to highlight urban voter choices across the country. 

iii) The AAP's Performance in Punjab - its salient features. Link. I use polling booth data and other insights to find out whether the AAP's performance in the Punjab LS polls replicated its Delhi victory from last year. I find that the reasons for the Punjab victory were different from those for the Delhi victory.

iv) Explaining the many reasons for the UPA's defeat. Link. Here I used regression techniques to filter out reasons for the UPA's defeat through analysis of available empirical information.

v) Voter Turnouts across India and explaining the variance: Link . An extended version of the article written in the EPW on voter turnouts, this time correlating these with survey data on voter preferences. 

e) I curated this election special page for the EPW, using data from all Lok Sabha elections since 1977 along with narratives: Link. Please see the bottom of the page for election statistics from 1977 onwards.

f) I managed to scrape data from Election Commission's live results page and run a visualisation on maps using Google Fusion Tables live during three assembly elections for North East states in 2013. Location:

Forthcoming & Pending in late 2014: 

An article on the West Bengal Lok Sabha election results: an in-depth look at the results using polling booth level data from both the 2014 LS polls and the 2011 assembly elections.

A research article on explaining the presence/absence of the "incumbency effect" over the years using disaggregated survey data provided by the CSDS' Lokniti. 

Friday, June 07, 2013

A keen contest in the offing

It's been a while since I have blogged. I wrote a preview of the NBA finals being played out in the USA for an Indian audience. (The first game between the San Antonio Spurs and the Miami Heat has already been completed with the former registering a close win to go up 1-0 in the 7 game series). Here goes -

To the average sports fan in India, there is not much of a world in sports beyond cricket and, to be more accurate, beyond the hugely successful and yet controversial Indian Premier League and Twenty 20 cricket. But to the Indian exposed to watching high quality sporting action on television over the years, the basketball played in the National Basketball Association in the United States must rank among the most spectacular sporting extravaganzas, alongside European football, World Cup football and Formula One racing.
To this aficionado, there is no sport that better encapsulates athletic ability, naturally given body strength, dexterity, chess-like strategizing and execution than the basketball played in the NBA. Basketball, with its emphasis on skill, strength, speed, team work and brain power, is surely among the most evolved sports today, and the soon to be played NBA finals (from June 7th onwards) between the Miami Heat and the San Antonio Spurs promise to reflect the best of the above mentioned characteristics of the game.
That is because the two teams are such a contrast to each other. The Heat are the powerfully constructed defending champions, built around the best basketball player of this generation – LeBron James, who defies simple characterization by being a scorer, finisher, passer, defender and orchestrator in equally high measure, with a versatility rarely seen in the past (only erstwhile greats Michael Jordan and Oscar Robertson come close). LeBron James is flanked by a flamboyant shooting guard in Dwyane Wade, a fellow 2003 NBA draftee who singlehandedly won the Heat its first championship in 2006, and an athletic forward in Chris Bosh, besides sharpshooting role players in Ray Allen, Shane Battier and Mike Miller. The Heat have become an even more formidable team since their victory last year by transitioning into a well oiled machine that combines superlative wing play with high octane defense. The Heat evoke respect and disdain in part measure among the sport watching public in the USA, who note that the construction of the squad was made possible through “star collusion” as James, Wade and Bosh came together during their respective free agencies, promising to dominate the NBA landscape for years to come. Since then, they have reached three consecutive NBA finals, winning one of them last year, and as favourites are poised for another triumph this year.
The San Antonio Spurs, on the other hand, were more organically constructed. The “winningest” franchise in American sport for more than 15 years now, the Spurs have had a few constants in their squad in that period – the phlegmatic metronome of a “big man”, Tim Duncan, for that entire period, the effervescent, exciting and creative savant in Argentine Manu Ginobili, and the younger but mature and speedy point guard in Frenchman Tony Parker – all coached by long time coach Gregg Popovich. The Spurs evoke mostly respect from opponents and basketball enthusiasts, who admire their selfless team play and their executive office’s foresight in constructing the squad through meticulous scouting, a focus on player development and staying true to a near idealistic basketball philosophy. No one believed that the Spurs, who last won the NBA championship in 2007, would endure through the latter years of their lynchpin Tim Duncan’s nearly one and a half decade long career and continue to remain a contender. But savvy personnel decisions and perseverance from their veteran players and their “front office” – general manager RC Buford in particular – have brought them back to contention. The fact that the Spurs are a “small market team” which has managed to construct a winning squad despite keeping their spending low and generally within “salary cap” limits is also noteworthy. In contrast, the Heat pay luxury tax (a punitive tax that is imposed upon teams crossing the salary cap and exceptions) because of the huge salaries of their three main players.
On the court, the Spurs rely more on team work, unselfish play, ball movement and complementary play than the Heat, but the latter have also, over the years, stolen some pages from the four time champions’ playbook. The Spurs’ veteran core has been patient in trying to peak at the right time and remain healthy for “money time” – the playoffs. The team has clinically dismantled its playoff opponents – an injury ridden Los Angeles Lakers, a hot shooting, perimeter play dominated Golden State Warriors and a defense first, “big man” play reliant Memphis Grizzlies – by duly adjusting its strategies to its opponents’ strengths and maximizing its own potential to the hilt. The Heat, on the other hand, have used their inherent talent advantage to pummel their opponents but have come off a very difficult Eastern Conference Finals against the Indiana Pacers, who stretched them to the maximum 7 games.
On paper, the Heat are superior. Their advantage lies in the fact that LeBron James does not have a viable challenger who could defeat him one-on-one, and the Spurs don’t necessarily have an answer despite fielding their own young defensive specialist Kawhi Leonard to slow James down. But the Spurs are the best opposition that the Heat will have faced so far in the post-season and are strong in some areas which are weak spots for the Heat. The Spurs are a terrific 3 point scoring team, have the patience to keep churning their offense through the shot clock to find the open man, and have the tenacity to play help defense as well as the relatively offensively challenged but defensively strong Indiana Pacers. Will the collective strength of a savvy team like the Spurs be enough to overcome the stellar Miami Heat? This writer’s expectation is that the Spurs will come up trumps in six grueling, exciting games, but there is no guarantee.
This promises to be a close contest and an exhibition of exquisite basketball skills for sport fans. Indian viewers can watch the games live on Sony Six (mostly telecast early in the morning, Indian Standard Time).