Since the State Decoded project tracks every mention of every section of a code throughout that code, I thought it might be interesting to look at which sections of the Code of Virginia are cited most often. The numbers turn out to reveal a bit about the nature of the laws that govern us.
The #1 position is held, by no small margin, by the statement of purpose of the Administrative Process Act, which is precisely as boring as it sounds. (With the important caveat that boring does not equate to unimportant!) In fact, the first seven are all government regulating itself, with the only really interesting one being § 2.2-3700, the first section of the Virginia Freedom of Information Act. The #8 position is a list of definitions that are used throughout the Motor Vehicles title of the code, the #9 law makes rape illegal, and at the #10 spot on our list is the law that prohibits drunken driving.
Of the 30,826 laws in the Code of Virginia, 5,490 (or 17.8%) are cited elsewhere in the code. Just 454 (or 1.5%) are cited 10 or more times.
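For anybody who wants to try this sort of tally on their own, here is a minimal sketch of the counting. It is not the State Decoded code. The regular expression, the function names, and the toy sections (with made-up numbers and text) are all assumptions for illustration, and a real legal code would need a much more careful citation parser.

```python
import re
from collections import Counter

# Assumed pattern for Virginia-style section numbers such as "2.2-3700".
# This is a simplification, not the project's actual citation parser.
SECTION_RE = re.compile(r"§+\s*(\d+(?:\.\d+)?-\d+(?:\.\d+)?)")


def citation_counts(sections):
    """Count how many times each section is mentioned by other sections.

    `sections` maps a section number to the text of that section.
    Self-references and mentions of unknown sections are ignored.
    """
    counts = Counter()
    for section_id, text in sections.items():
        for cited in SECTION_RE.findall(text):
            if cited != section_id and cited in sections:
                counts[cited] += 1
    return counts


def summarize(sections, threshold=10):
    """Report how many sections are cited at all, and how many are cited
    at least `threshold` times, as counts and as shares of the total."""
    counts = citation_counts(sections)
    total = len(sections)
    cited = len(counts)
    heavily_cited = sum(1 for n in counts.values() if n >= threshold)
    return {
        "total": total,
        "cited": cited,
        "cited_share": cited / total,
        "heavily_cited": heavily_cited,
        "heavily_cited_share": heavily_cited / total,
    }


# Toy data: hypothetical section numbers and text, for illustration only.
sample = {
    "1.1-100": "Statement of purpose.",
    "1.1-200": "Subject to § 1.1-100, records are open.",
    "2.2-300": "Definitions, as provided in § 1.1-100.",
    "3.3-400": "See § 2.2-300 and § 1.1-100.",
}

print(citation_counts(sample).most_common())
print(summarize(sample, threshold=2))
```

The same arithmetic gives the percentages above: 5,490 divided by 30,826 is about 17.8%, and 454 divided by 30,826 is about 1.5%.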
Why is this interesting? Well, it provides a look at the interconnectedness within the Code of Virginia. Or, really, the lack thereof. One of the goals of this project is to provide a more logical interface for browsing legal codes, instead of the usual, rigid, hierarchical system that divides up most of them. The lack of interconnectedness of Virginia’s code is an indicator that we’ll need metrics other than cross references to establish new groupings of laws, whether internal to the code (such as the shared use of defined terms) or external to the code (shared citations in legal decisions or legislation).
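As a rough illustration of one of those internal metrics, here is a sketch of grouping laws by the defined terms they share, using Jaccard similarity (the size of the overlap divided by the size of the union). The choice of Jaccard, the 0.5 cutoff, the function names, and the sample sections are all assumptions of mine, not something the project has settled on.

```python
from itertools import combinations


def jaccard(terms_a, terms_b):
    """Jaccard similarity of two collections of defined terms (0.0 to 1.0)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_pairs(terms_by_section, min_score=0.5):
    """Return (section, section, score) tuples for every pair of sections
    whose defined terms overlap by at least `min_score`, best match first."""
    pairs = []
    for id_a, id_b in combinations(sorted(terms_by_section), 2):
        score = jaccard(terms_by_section[id_a], terms_by_section[id_b])
        if score >= min_score:
            pairs.append((id_a, id_b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical sections and the defined terms each one uses.
terms_by_section = {
    "A": {"highway", "motor vehicle", "operator"},
    "B": {"motor vehicle", "operator"},
    "C": {"public body", "public records"},
}

for id_a, id_b, score in related_pairs(terms_by_section):
    print(f"{id_a} and {id_b} share defined terms (score {score:.2f})")
```

In this toy data, sections A and B end up paired even though neither cites the other, which is the kind of relationship that cross references alone would miss.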
It might be illuminating to compare these data about Virginia to data from other states as the State Decoded project is implemented elsewhere. Perhaps Virginia is an outlier in its internal cross-pollination, or perhaps it’s perfectly normal.
Spit-balling an idea: Could you use something like latent semantic analysis to determine the topics discussed by statutes, and use that to establish relationships outside of direct mention or hierarchy?
I’d been looking at using Maui for that purpose, but I’ve been worried about that, since Maui lacks a legal-specific dictionary. LSA (which I’d never heard of before!) doesn’t seem, conceptually, to require that sort of preexisting knowledge, which makes it a pretty great candidate. Semantic Vectors and Infomap NLP both look like they’d do the trick. The former is especially promising, because it’s Lucene-based, and I’m looking hard at using Lucene/Solr as a fundamental part of this project.
Thanks for a fine suggestion, Bill!
I wonder what causes the page to mis-render text placement … see how some of the text overlaps letters and borders :: http://imagebin.org/220282
and the comment box renders misaligned :: http://imagebin.org/220281
this is on chrome browser running on ubuntu 12.04 linux — don’t know if it’s my OS reading improperly
Good Lord, that’s an almost impressively bad rendering problem, Mike! I’d have a hard time doing that on purpose. I’ll fire up Ubuntu in VMware, install Chrome in there real quick, and figure out what’s going on. Thanks so much for taking the time to take those screenshots and let me know about this.