We the People: Analyzing Comments on the Federal Register

In the first post of our series on the Federal Register, we talked about the “what” and the “why”: what are public comments on proposed rulemaking, and why is it so important to be thinking about them? In this installment, we’ll dig into the “how”: how do we actually do a thematic analysis of Federal Register comments?

The coding process

As we discussed last week, the first component of the work is a thematic analysis of the main arguments supporting or opposing the proposed rulemaking, and NVivo is our software of choice to help us with that. First, we upload all comments into NVivo to begin coding relevant themes.

Coding is basically the process of “tagging” sections of text with relevant thematic codes (text can be coded to multiple topics). The image below is a hypothetical example of coding—this isn’t what it actually looks like in NVivo, but it gives a sense of how text is “tagged” to the relevant codes.

NVivo example

The coding hierarchy

But where do these codes come from? How do we know to code for “impact on community” and “impact on ecosystem,” for example? The creation of the list of codes (known as a coding hierarchy) is a flexible process that is revisited throughout the analysis. Typically, we begin by reviewing a small portion of the comments to identify themes being raised. Based on what we see, we create an a priori coding hierarchy—essentially, we’re making an informed guess about what we think we might see in the rest of the comments. This is a great place to pull in subject-matter experts so that the coding hierarchy is informed both by what we’re seeing in the first portion of comments and by expert knowledge of the relevant field.

This coding hierarchy isn’t set in stone—it’s flexible! As we continue coding, let’s say we uncover an interesting new theme, such as support for the construction of a new tourist resort on the moon because it would create jobs for underemployed lunar residents. We can add that theme to the coding hierarchy and continue looking for it in other comments.

At the end of analysis, we revisit this coding hierarchy, asking questions like: Should some of these codes be combined or split up? Are some of these really just one-offs? Using the example in the paragraph above, we thought the argument for lunar job creation might be a common one. But after coding hundreds of comments, maybe that was the only one that raised it. In this case, we could combine these one-offs together into an “other” code highlighting less commonly endorsed arguments.

Another option might be to combine or separate codes, depending on what we found. Using the same example, let’s say one of our codes was “job creation”—we might find at this point that there are two different themes here. Some commenters discussed job creation for underemployed lunar residents, while others were focused on job creation for earthlings. Depending upon the proposed rulemaking, we might consider these themes separate enough to split the code.

Lunar construction crop

Ensuring consistency

Consistency is the name of the game in NVivo. When we’re coding, we need to make sure that all analysts have the same understanding of the coding hierarchy and are applying the codes the same way. This is how we know the analysis is completed as intended, rather than influenced by one analyst’s interpretation of a theme. So how do we do that? There are a handful of best practices that help (such as training analysts on the coding hierarchy and creating a hierarchy that includes clearly written definitions of each code), but the key approach is double coding.

In double coding, a small percentage of all documents are coded by two different analysts. The analysts then meet and discuss the coding, looking for any places where they did not code the same way. This is done iteratively, until all analysts are applying the codes in the same way. In addition to being a best practice for a rigorous analysis, double coding is also a great opportunity of further refining the coding hierarchy because it necessitates discussion around what analysts are seeing in the data and how well the hierarchy fits what they’re seeing.

Double coding crop

Categorizing comments

The second component of our analysis is classifying the comments received, which will contextualize the themes identified during coding. For example, each comment will be categorized by type (form letter, duplicate, non-relevant, etc.), whether it contains substantive or non-substantive comments, and commenter name and type (nonprofit, individual, for-profit company, educational institution, etc.).

So why do we care what “type” each comment is? This can provide a lot of information to contextualize the findings of our thematic analysis. For example, we’d want to know if the most commonly raised concerns came from many different commenter types, or whether 95% of them were sourced from the same form letter submitted by multiple people. This type of information provides the context to know how to weigh these comments.

Pulling it all together

Finally, after coding and categorizing, the last piece of analysis is to put both components together. This is one of the reasons that we love using NVivo for this type of work. NVivo’s flexibility lets us combine the qualitative, thematic analysis we’ve conducted and the categorization of comment types to create the most nuanced analysis possible.

We start by examining the main themes and looking for any patterns that pop up within or across groups. For example, perhaps we find that the bulk of comments supporting the construction of a lunar resort because it would lead to job creation come from for-profit construction companies, while the majority of comments opposing the construction due to permanent damages to the local ecosystem come from nonprofit organizations. This type of information allows for a more nuanced interpretation of our findings.

Conclusion

And that concludes our summary of how we use NVivo to analyze public comments on proposed rulemaking on the Federal Register. This type of analysis will likely be in high demand as we finish the Biden administration’s first 100 days in office; but even looking forward, this type of work remains important to ensure that the people have a voice in their government.