Improving the state’s ability to tax effectively is central to the development process. However, tax collections as a percentage of GDP are low in most developing countries (in India the ratio is about 15%), and non-compliance is widely seen as an important problem. A common strategy to evade taxes is to establish shell or ‘bogus’ firms, which issue fake receipts to genuine firms to help them reduce their tax burden. The tax authorities have documented the existence of these firms through physical inspections, which are carried out only sporadically given the authorities’ limited resources. At present, tax officials can identify tax evaders only after manually evaluating the available information. A key challenge in improving tax compliance is therefore to design a mechanism that can regularly, cheaply and reliably identify such bogus firms. We plan to work with tax authorities in Punjab, India, to design a proof-of-concept machine-learning (ML) tool that uses ‘big-data’ methods to reduce some of the burden on tax officials in identifying tax-evading firms and to answer policy questions on tax compliance using modern methods.

Researchers

Aprajit Mahajan

Aprajit Mahajan is an Associate Professor in the Department of Agricultural and Resource Economics at the University of California, Berkeley. He is also a Research Associate at the National Bureau of Economic Research (NBER), and an affiliate of the Abdul Latif Jameel Poverty Action Lab (J-PAL) and the Center for Effective Global Action (CEGA). Aprajit is a development economist with a strong interest in econometric issues motivated by empirical work. He has worked extensively in India, and his recent areas of work include agriculture, health, management, and taxation. Aprajit received his PhD in Economics from Princeton University. He previously taught at Stanford University and the University of California, Los Angeles.

Shekhar Mittal

Shekhar Mittal is a Senior Economist at Amazon. During his PhD, Shekhar was interested in development and public economics. His work used large-scale government data sets to better understand government capacity, and combined these data sets with field interventions to address questions of first-order causal interest. This work began before he joined Amazon. The views expressed in this paper are those of the author(s) and cannot be attributed to Amazon Inc., its Executive Boards, or management teams.

Ofir Reich

Ofir Reich is an independent data scientist conducting data-intensive projects for economic development. He is primarily interested in using data to detect fraud and corruption in weak-enforcement settings. He worked on flood prediction in India at Google and has served as a data scientist for Precision Agriculture for Development, a data scientist at the Center for Effective Global Action (UC Berkeley), Chief Data Scientist and machine-learning expert for a fraud-detection start-up, and a mathematical research team leader in an elite technological unit of the Israeli army. Ofir holds a BSc in Mathematics and Physics from the Hebrew University of Jerusalem.