Improving the state’s ability to tax effectively is central to the development process. However, tax collections as a percentage of GDP are low in most developing countries (in India the ratio is about 15%), and non-compliance is widely seen as an important problem. A common evasion strategy is to establish shell or ‘bogus’ firms that issue fake receipts to genuine firms to help them reduce their tax burden. The tax authorities have documented the existence of these firms through physical inspections, which are carried out only sporadically given the authorities’ limited resources. Tax officials can currently identify tax evaders only after manually evaluating the available information. A key challenge in improving tax compliance is therefore to design a mechanism that can regularly, cheaply and reliably identify such bogus firms. We plan to work with tax authorities in Punjab, India, to design a proof-of-concept machine-learning (ML) tool that uses ‘big-data’ methods to reduce some of the burden on tax officials in identifying tax-evading firms, and to answer policy questions on tax compliance using modern methods.