Friday, February 14, 2014

Plainsite: Putting the Law in Plain Sight

A feature on Plainsite has perhaps been long overdue. I first met Aaron almost a year ago at the Codex FutureLaw Conference, and every time I’ve talked to him since, I learn something new. Aaron talks about many of the useful features of his site below, but I thought I’d point out one of his sub-projects—the identification of Intellectual Ventures’ many shell companies. If you hate patent and copyright trolls, all the information on them is there. Do something with it. Go troll the trolls ;) [Note: this is not to be construed as legal advice. I'm not liable if you decide to do something stupid.]

Tell me about yourself.
I started Think Computer Corporation in 1998 as a freshman in high school who wanted something to do other than homework. Not too long after, I encountered my first legal issue: a trademark dispute over my company’s name. It lasted eight years and cost a lot of money to resolve thanks to legal fees. I didn’t really find law—it found me. Ever since then, I’ve had an interest in fixing some of the inefficiencies in the legal system and learned that trying to do anything significant will make the law rear its head. Sixteen years later, Think is still a going concern; along with its sister non-profit, Think Computer Foundation, it runs PlainSite.

Tell me about Plainsite.
PlainSite aims to present the entire legal system to the average user in an understandable manner. That’s no small task—our legal system in the United States is a byzantine labyrinth that has been shrouded in secrecy for decades by the legal profession. It’s a peculiar kind of secrecy, however, because much of the information is actually available to the public but hidden in plain sight—it’s extremely difficult to know where to look in order to connect the dots. PlainSite now has information about 100 million docket entries in almost 6 million case dockets spanning federal and state courts and agencies; 2.5 million patent applications (and growing); 5 million companies, non-profits, government agencies and law firms; over 1 million lawyers; almost 500,000 sections of laws and regulations; over 25,000 judges and examiners, and plenty of other interesting data. And much like a search engine, there’s just one box on the home page where you can go and find what you’re looking for.  Every day approximately 3,000 individuals use PlainSite to access court case dockets and documents, look up corporate profiles, investigate intellectual property assignments, research and annotate federal and state laws, learn about political donations, and view profiles for judges, lawyers and law firms, among other features.

What inspired you to create Plainsite?
PlainSite started out as a side-project dedicated to legal transparency. The question I was trying to answer, along with some friends, was why no one was being prosecuted for the obvious crimes that led to the 2008 financial crisis. (It’s still a good question.) We began by linking social issues to statutes, but it then became apparent that the statutes themselves needed to be linked to court cases, since the law is interpreted every day by judges. Using Aaron Swartz’s PACER data, I was able to put together a database of court cases, and it grew from there.

What’s innovative about Plainsite?
Perhaps the single greatest problem with the legal system in the United States is the lack of freely available information. The official PACER database charges $0.10 USD per “page” to access court documents, and many government agencies publish data in different formats, when that data can be found at all. At the same time, the legal system is unbelievably complex, so without information on its inner workings, it is effectively impenetrable to outsiders.

In 2008, information activist Aaron Swartz worked with a team at Princeton University to liberate hundreds of thousands of dockets from PACER, but once the information was technically free, not much happened with it.  PlainSite began as a site that could link social issues to specific sections of the United States Code, but with Aaron Swartz’s PACER data, it also added court cases for free for the first time (and linked those same social issues to court cases as well as statutes).  Now, PlainSite pulls in data not only from PACER via the Princeton RECAP project, but from the GPO, USPTO, IRS, California WQRB, and a variety of other government agencies to offer a comprehensive and understandable look at the legal system.

While other paid legal research services do exist, they are typically priced out of the range of even middle-class individuals, and they work on the assumption that users know the specific legal terms that they are looking for.  PlainSite takes the opposite approach, grouping information into simple, tabbed profiles that require less intense searching.

What have been your greatest challenges thus far?

The main challenges have been technical and competitive. The technical challenges stem from the enormous amount of data that PlainSite tracks; the database is about 70GB right now (and grows every day), which means that just keeping the site up and running (with a staff of one person) is no small feat. And that is after filtering and cleaning up data provided by government agencies that can be 10 or 100 times as large. It’s truly a "big data" project. The other challenge has been obtaining more data and encouraging paid adoption so that the site can be self-sustaining. We have far more paid subscriptions than we did when we started, but the legal profession is very slow to adopt new tools given the Lexis/Westlaw oligopoly that has been in place for some time. And when the government sometimes charges per page for information, as it does with PACER, it’s that much more difficult to build a service without significant revenue.

Who have been your primary customers?
We offer two plans. Most of our customers so far are pro se litigants, who have signed up for the PlainSite Pro Se plan. It’s a very inexpensive way to conduct legal research at $9.99 per month. We also have PlainSite Pro for general counsel and lawyers, which is $99 per month.

How is your site different from companies like LexisNexis, Westlaw, Casetext, or Fastcase?
All those products are focused on general text search, usually through opinions. While improvements can always be made, those companies generally do a good job with that, and the search tools for opinions are pretty mature. PlainSite is different because we structure all the other data: not just opinions, but companies, law firms, lawyers, and the documents they’re attached to. We focus on anything that isn’t an opinion.

Plainsite has a ton of features. Which one are you the most proud of?
There’s the blog answer, and the nerdy answer. My honest answer: I’m happy that it actually works. Behind the scenes, there’s a lot of work in taking government data, with all its typos and errors, and putting it all together. Government sources provide information in wildly different formats, so PlainSite has to clean up the data automatically. For example, the United States Patent and Trademark Office (USPTO) has assignment databases for both patents and trademarks that are compiled from paper filings. In these databases, the well-known company IBM is referred to in over 1,200 different ways due to various typographical errors and reference mistakes. PlainSite is designed to automatically detect and fix such errors. Similarly, law firms regularly change names and cannot be spell-checked since they are named after their partners. PlainSite merges the seemingly infinite variations automatically in many cases.
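
[Editor's sketch: to give a feel for the kind of cleanup described here, below is a minimal Python illustration of merging company-name variants. It is not PlainSite's actual method, which the interview does not describe. The suffix list, the 0.9 similarity threshold, and the function names `normalize`, `same_entity`, and `group_names` are all assumptions made up for this example, and the sample names are invented rather than taken from USPTO data.]

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, drop common suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in SUFFIXES)


def same_entity(a, b, threshold=0.9):
    """Guess that two names match if their normalized forms are similar.

    The threshold is an assumed value; real data would need tuning.
    """
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def group_names(names, threshold=0.9):
    """Greedy clustering: a name joins the first group whose first
    member it matches, otherwise it starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if same_entity(name, group[0], threshold):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


if __name__ == "__main__":
    sample = [
        "International Business Machines Corporation",
        "INTERNATIONAL BUSINESS MACHINES CORP.",
        "Internatonal Business Machines Corp",  # typo: missing "i"
        "Apple Inc.",
    ]
    for group in group_names(sample):
        print(group)
```

A greedy pass like this is easy to follow but compares every name against every group, so it would likely be too slow and too error-prone for millions of records; it also cannot connect an abbreviation such as "IBM" to the full name, which would need a separate alias table.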

But the Motion Sensor feature is the most useful to consumers. It allows the user to read through a docket and understand how a certain judge or lawyer generally behaves. Motion practice matters a huge amount. Many cases never get to trial, let alone discovery. Just getting past the initial hurdles is difficult, and it comes down to whether you can survive a Motion to Dismiss. It helps to have all evidence at your disposal. Good legal arguments are important, but so is factual information. If you can find out additional information about your opponent, their company, their counsel, or even how the judge responds to such motions, you have a greater degree of predictability in your case and in the likely outcome of your filing.

There’s been a lot of discussion about compiling info on how judges rule on motions. Have any judges ever complained to you about the feature?
Not yet. I did have one federal judge email me saying that we had confused his profile. When I looked into it, it was because the information was poor, and he had been linked to another judge with a very similar name living in the same region, but during the Civil War era. That judge was eager to not be labeled a Confederate judge.

Did you ever think about going to law school?
I thought about it, but decided I’d rather not go. Actually, if you go back to the Confederate judge story, there’s a part of the site that lists the educational institutions of judges and lawyers. In that time period, it often lists a judge’s law school as “Read.” This means they sat down with books and learned the law by reading it, rather than by going to law school. Abe Lincoln learned that way as well.

What are your hopes for the company in the future? For the legal industry?
I hope that PlainSite really does change the legal profession by making lawyers more efficient, more affordable, and more accountable for their actions. I hope that in the future, government agencies use PlainSite not just to look up stuff, but to run government systems.

For the legal industry, I hope that prices come down. The end of hourly billing may not be possible, but there needs to at least be a reduction in prices to make legal services affordable to more people.


  1. Go with Pacer or Lexis. This guy is just a web crawler, unfairly posting once password protected information all over google under the guise of "freedom of information" he so obviously will just try selling out.

  2. Totally agree with the above poster. Under the guise of "information should be free", he actually charges attorneys to correct incorrect information he hosts about them on his site. When you contact this site to correct obviously incorrect information about you or your firm (like address information and cited cases), you have to enter a credit card and are charged for correcting incorrect information.

  3. Plain site and its founder have created a corrupt entity that publishes incorrect information (in many cases) and then charges lawyers to act on behalf of individuals to take it down. Average cost is $5,000 to eliminate errors. He effectively blackmails people (that he selectively curates) to remove harmful information.

  4. Plainsite is charging for "free" information. And at the same time openly giving access to all public records of which many are frivolous lawsuits, and consequently ruining lives through google searches and wrongful associations. This is highly irresponsible and serves no constructive purpose to the public, but is destructive to many.