Predictive Analytics in Enforcement: Searching for Regulatory Violations

Posted by David Kretch on 7/28/15 11:50 AM


*This is the second installment of our new blog series: Predictive Analytics in Enforcement. See our first post: What is Predictive Analytics?*

Enforcing regulations is one of the government’s biggest responsibilities. One way agencies do this is by finding violators and penalizing them, typically by levying fines. However, these agencies face a monumental task: they must oversee a very large number of organizations with relatively limited resources.

Predictive analytics allows regulatory agencies to deploy their resources more effectively when searching for violators. Finding violators is often like looking for needles in a haystack. Effective predictive analytics gives regulatory agencies a metal detector that makes it much easier to separate the needles from the hay. However, predictive models are never flawless: sometimes they find a nail instead of a needle, that is, they flag an organization that turns out not to be a violator. Regardless, the better regulatory agencies can pick out potential violators, the more time they can spend on enforcement and the less time they waste on ultimately fruitless investigations of non-violators.

The data used in predictive analytics comes from many sources: required annual reports, tax records, and so on. Information derived from these sources (for example, the size of organizations, how they allocate their money, or who they deal with) typically forms the 'features' or 'predictors' used to make the prediction. The outcome we are trying to predict, whether an organization committed a violation, is typically taken from previous investigations. So when we build these predictive models, we ask: looking at organizations whose violation status we now know, what did we know at the time that could have told us in advance that they were violating?
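To make the "what did we know at the time" question concrete, here is a minimal sketch of how a training table might be assembled. All field names (`org_id`, `filed`, `revenue`, `admin_spend`, `opened`, `violation`) and all records are hypothetical and invented for illustration. The key idea is that each investigated organization is paired only with the most recent report filed before its investigation opened, so the features contain nothing that was unknown at the time.

```python
from datetime import date

# Hypothetical annual reports (the source of the features).
reports = [
    {"org_id": 1, "filed": date(2012, 3, 1), "revenue": 500_000, "admin_spend": 200_000},
    {"org_id": 1, "filed": date(2014, 3, 1), "revenue": 650_000, "admin_spend": 130_000},
    {"org_id": 2, "filed": date(2013, 3, 1), "revenue": 900_000, "admin_spend": 90_000},
    {"org_id": 3, "filed": date(2014, 3, 1), "revenue": 300_000, "admin_spend": 60_000},
]

# Hypothetical past investigations (the source of the outcome label).
investigations = [
    {"org_id": 1, "opened": date(2013, 6, 1), "violation": True},
    {"org_id": 2, "opened": date(2013, 9, 1), "violation": False},
    {"org_id": 3, "opened": date(2013, 1, 1), "violation": True},
]

def build_training_rows(reports, investigations):
    """Pair each investigation outcome with features known before it opened."""
    rows = []
    for inv in investigations:
        # Keep only reports filed before the investigation opened.
        prior = [
            r for r in reports
            if r["org_id"] == inv["org_id"] and r["filed"] < inv["opened"]
        ]
        if not prior:
            continue  # nothing was known in advance, so there is nothing to learn from
        latest = max(prior, key=lambda r: r["filed"])
        rows.append({
            "org_id": inv["org_id"],
            "revenue": latest["revenue"],
            "admin_share": latest["admin_spend"] / latest["revenue"],
            "violation": int(inv["violation"]),
        })
    return rows

training_rows = build_training_rows(reports, investigations)
for row in training_rows:
    print(row)
```

In this toy data, organization 3 is dropped because its only report was filed after the investigation began, and organization 1 is described by its 2012 report rather than its later 2014 one.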

One common type of predictive model used in enforcement seeks to classify the organizations that a given agency is responsible for regulating into groups (e.g., non-violators and violators). Data analysts can create more complicated models aimed at classifying specific kinds of violators (e.g., by severity or type of violation). The predictive models generally do this classification by predicting a probability of violation for each organization. The agency can then focus on the organizations with the highest predicted probabilities of violation.
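As an illustration of classifying by predicted probability, the sketch below fits a small logistic regression by gradient descent and ranks unscored organizations from most to least likely to be violating. Logistic regression is just one common choice for this kind of model; the post does not say which technique any particular agency uses. The two features (a late-filing rate and a cash ratio, both scaled to 0-1), the organization names, and every data point are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Predicted probability of violation for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = predict_proba(weights, bias, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += error * xij
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Hypothetical training data: [late_filing_rate, cash_ratio] per organization,
# with labels taken from past investigations (1 = violator, 0 = non-violator).
X_train = [[0.0, 0.9], [0.1, 0.8], [0.2, 0.7], [0.8, 0.2],
           [0.9, 0.3], [0.7, 0.1], [0.5, 0.5], [0.4, 0.6]]
y_train = [0, 0, 0, 1, 1, 1, 1, 0]

weights, bias = fit_logistic(X_train, y_train)

# Hypothetical organizations that have not been investigated yet.
unscored = {
    "Org A": [0.85, 0.15],
    "Org B": [0.15, 0.85],
    "Org C": [0.55, 0.45],
}

scores = {name: predict_proba(weights, bias, x) for name, x in unscored.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: predicted probability of violation = {scores[name]:.2f}")
```

An agency with capacity for only a few investigations would start at the top of `ranked` and work down, rather than applying a fixed yes/no cutoff.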

For one of our clients, Summit used a technique called survival analysis to build a model that predicts how long organizations subject to fines will take to pay them. The model takes advantage of data that the regulated organizations are required to submit to the government, along with commercial databases of financial information. It allows our client to focus their resources on the organizations that are likely to be delinquent (i.e., whose predicted time to payment is greater than the time allowed for payment).
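For readers unfamiliar with survival analysis, the sketch below implements the Kaplan-Meier estimator, one of the simplest tools in that family. It is not the model Summit built, which predicts time to payment for individual organizations using their characteristics; it only estimates, across all fined organizations, the share whose fines remain unpaid after a given number of days. The durations, the payment flags, and the 30-day deadline are hypothetical. What makes this survival analysis rather than simple averaging is the handling of fines that were still unpaid when the data was collected: they are treated as "censored" observations instead of being discarded.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time t where at least one payment occurred.

    S(t) is the estimated probability that a fine is still unpaid after t days.
    observed[i] is 1 if the fine was paid at durations[i], or 0 if it was still
    unpaid when observation ended (a censored record).
    """
    event_times = sorted({t for t, paid in zip(durations, observed) if paid})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)  # still unpaid just before t
        events = sum(1 for d, paid in zip(durations, observed) if d == t and paid)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

def survival_at(curve, t):
    """Evaluate the step function S at time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s

# Hypothetical data: days from fine to payment (or to end of observation).
durations = [10, 20, 20, 30, 45, 60, 60, 90]
observed = [1, 1, 1, 0, 1, 1, 0, 0]  # 0 = still unpaid when the data was pulled

curve = kaplan_meier(durations, observed)
deadline_days = 30  # hypothetical payment deadline
share_unpaid = survival_at(curve, deadline_days)

print(f"Estimated share of fines still unpaid at day {deadline_days}: {share_unpaid:.1%}")
```

With this toy data, the estimated share of fines still unpaid at the 30-day deadline is 62.5%. A model like the one described above goes further by letting that curve vary with each organization's characteristics, so that likely delinquents can be identified individually.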

In our next blog posts, we will continue discussing predictive models, specifically some of the technical issues arising from data sources and dataset limitations.

For more information about predictive analytics, check out Summit Principal Albert Lee's recent publication in the AGA's Journal of Government Financial Management: "Predictive Analytics: the New Tool to Combat Fraud, Waste and Abuse"

Topics: predictive analytics
