A feature on PlainSite is perhaps long overdue. I
first met Aaron almost a year ago at the Codex FutureLaw Conference, and every
time I’ve talked to him since, I learn something new. Aaron talks about many of
the useful features of his site below, but I thought I’d point out one of his
sub-projects—the identification of Intellectual Ventures’ many shell companies. If
you hate patent and copyright trolls, all the information on them is there. Do
something with it. Go troll the trolls ;) [Note: this is not to be construed as legal advice. I'm not liable if you decide to do something stupid.]
Tell me about
yourself.
I started Think Computer Corporation in 1998 as a freshman
in high school who wanted something to do other than homework. Not too long
after, I encountered my first legal issue: a trademark dispute over my
company’s name. It lasted eight years and cost a lot of money to resolve thanks
to legal fees. I didn’t really find law—it found me. Ever since then, I’ve had
an interest in fixing some of the inefficiencies in the legal system, and I’ve
learned that trying to do anything significant will make the law rear its head.
Sixteen years later, Think is still a going concern; along with its sister
non-profit, Think Computer Foundation, it runs PlainSite.
Tell me about
PlainSite.
PlainSite aims to present the entire legal system to the
average user in an understandable manner. That’s no small task—our legal system
in the United States is a byzantine labyrinth that has been shrouded in secrecy
for decades by the legal profession. It’s a peculiar kind of secrecy, however,
because much of the information is actually available to the public but hidden
in plain sight—it’s extremely difficult to know where to look in order to connect
the dots. PlainSite now has information about 100 million docket entries in
almost 6 million case dockets spanning federal and state courts and agencies;
2.5 million patent applications (and growing); 5 million companies,
non-profits, government agencies and law firms; over 1 million lawyers; almost
500,000 sections of laws and regulations; over 25,000 judges and examiners; and
plenty of other interesting data. And much like a search engine, there’s just
one box on the home page where you can go and find what you’re looking for. Every day
approximately 3,000 individuals use PlainSite to access court case dockets and
documents, look up corporate profiles, investigate intellectual property
assignments, research and annotate federal and state laws, learn about
political donations, and view profiles for judges, lawyers and law firms, among
other features.
What inspired you to
create PlainSite?
PlainSite started out as a side-project dedicated to legal
transparency. The question I was trying to answer, along with some friends, was
why no one was being prosecuted for the obvious crimes that led to the 2008
financial crisis. (It’s still a good question.) We began by linking social
issues to statutes, but it then became apparent that the statutes themselves needed
to be linked to court cases, since the law is interpreted every day by judges.
Using Aaron Swartz’s PACER data, I was able to put together a database of court
cases, and it grew from there.
What’s innovative
about PlainSite?
Perhaps the single greatest problem with the legal system in
the United States is the lack of freely available information. The official
PACER database charges $0.10 USD per “page” to access court documents, and many
government agencies use different formats when publishing data, which is hard
to find in the first place. At the same time, the legal system is
unbelievably complex, and so without information on its inner workings, it is
effectively impenetrable to outsiders.
In 2008, information activist Aaron Swartz worked with a
team at Princeton University to liberate hundreds of thousands of dockets from
PACER, but once the information was technically free, not much happened with
it. PlainSite began as a site that could
link social issues to specific sections of the United States Code, but with
Aaron Swartz’s PACER data, it also added court cases for free for the first
time (and linked those same social issues to court cases as well as
statutes). Now, PlainSite pulls in data
not only from PACER via the Princeton RECAP project, but from the GPO, USPTO,
IRS, California WQRB, and a variety of other government agencies to offer a
comprehensive and understandable look at the legal system.
While other paid legal research services do exist, they are
typically priced out of the range of even middle-class individuals, and they
work on the assumption that users know the specific legal terms that they are
looking for. PlainSite takes the
opposite approach, grouping information into simple, tabbed profiles that
require less intense searching.
What have been your
greatest challenges thus far?
The main challenges have been technical and competitive. The
technical challenges stem from the enormous amount of data that PlainSite
tracks; the database is about 70GB right now (and grows every day), which means
that just keeping the site up and running (with a staff of one person) is no
small feat. And that is after filtering and cleaning up data provided by
government agencies, which can be 10 or 100 times as large. It’s truly a
"big data" project. The other challenge has been obtaining more data
and encouraging paid adoption so that the site can be self-sustaining. We have
far more paid subscriptions than we did when we started, but the legal
profession is very slow to adopt new tools given the Lexis/Westlaw oligopoly
that has been in place for some time. And when the government sometimes charges
per page for information, as it does with PACER, it’s that much more difficult
to build a service without significant revenue.
Who have been your
primary customers?
We offer two plans. Most of our customers so far are pro se
litigants, who have signed up for the PlainSite Pro Se plan. It’s a very
inexpensive way to conduct legal research at $9.99 per month. We also have
PlainSite Pro for general counsel and lawyers, which is $99 per month.
How is your site
different from companies like LexisNexis, Westlaw, Casetext, or Fastcase?
All those products are focused on general text search,
usually through opinions. While improvements can always be made, those companies
generally do a good job with that, and the search tools for opinions are pretty
mature. PlainSite is different because we structure all the other data: not just
opinions, but companies, law firms, lawyers, and the documents they’re attached to.
We focus on anything that isn’t an opinion.
PlainSite has a ton
of features. Which one are you the most proud of?
There’s the blog answer, and the nerdy answer. My honest
answer is that I’m happy that it actually works. Behind the scenes, there’s
a lot of work in taking government data, with all its typos and errors, and
putting it all together. Government sources provide data in wildly different
formats, so PlainSite has to clean it up automatically.
For example, the United States Patent and Trademark Office (USPTO) has
assignment databases for both patents and trademarks that are compiled from
paper filings. In these databases, the well-known company IBM is referred to
in over 1,200 different ways due to various typographical errors and reference
mistakes. PlainSite is designed to automatically detect and fix such errors.
Similarly, law firms regularly change names and cannot be spell-checked since
they are named after their partners. PlainSite merges the seemingly infinite
variations automatically in many cases.
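
To give a rough sense of what this kind of cleanup can look like, here is a minimal Python sketch. It is not PlainSite's actual code, which the interview does not describe. The suffix list, the alias table, and the function names canonical_name and group_variants are all assumptions made up for illustration. The idea is simply to reduce each raw name to a comparison key and then group the variants that share a key.

```python
import re
from collections import defaultdict

# Hypothetical lookup tables, for illustration only. A real system would
# need far larger lists and probably fuzzy matching as well.
SUFFIXES = {"CORPORATION", "CORP", "INCORPORATED", "INC", "COMPANY", "CO", "LLC", "LTD"}
ALIASES = {
    "INTERNATIONAL BUSINESS MACHINES": "IBM",
    "INTERNATIONAL BUSINESS MACHINE": "IBM",
}


def canonical_name(raw):
    """Reduce a raw name string to a key used for comparing variants."""
    text = raw.upper().replace("&", " AND ")
    # Drop punctuation, so "I.B.M." becomes "IBM".
    text = re.sub(r"[^A-Z0-9 ]", "", text)
    words = text.split()
    # Strip trailing corporate suffixes such as "CORP" or "INC".
    while words and words[-1] in SUFFIXES:
        words.pop()
    key = " ".join(words)
    # Map known long forms onto one preferred key.
    return ALIASES.get(key, key)


def group_variants(names):
    """Group raw names by their canonical key."""
    groups = defaultdict(list)
    for name in names:
        groups[canonical_name(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "I.B.M. Corp.",
        "International Business Machines Corporation",
        "INTERNATIONAL BUSINESS MACHINES, CORP",
        "Intl. Business Machines",
    ]
    for key, variants in group_variants(sample).items():
        print(key, variants)
```

In this sketch the first three sample strings end up under the same key, while "Intl. Business Machines" does not, because the abbreviation is not in the alias table. That gap hints at why merging over a thousand real-world variants takes more than a handful of rules.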
But the Motion Sensor feature is the most useful to
consumers. It allows the user to read through a docket and understand how a
certain judge or lawyer generally behaves. Motion practice matters a huge amount.
Many cases never get to trial, let alone discovery. Just getting past the
initial hurdles is difficult, and it comes down to whether you can survive a
Motion to Dismiss. It helps to have all evidence at your disposal. Good legal
arguments are important, but so is factual information. If you can find out
additional information about your opponent, their company, their counsel, or
even how the judge responds to such motions, you have a greater degree of
predictability in your case, and the likely outcome of your filing.
There’s been a lot of
discussion about compiling info on how judges rule on motions. Have any judges
ever complained to you about the feature?
Not yet. I did have one federal judge email me saying that
we had confused his profile. When I looked into it, it was because the
information was poor, and he had been linked to another judge with a very
similar name living in the same region, but during the Civil War era. That
judge was eager to not be labeled a Confederate judge.
Did you ever think
about going to law school?
I thought about it, but decided I’d rather not go. Actually,
if you go back to the Confederate judge story, there’s a part of the site that
lists the educational institutions of judges and lawyers. In that time period,
it often lists a judge’s law school as “Read.” This means they sat down with
books and learned the law by reading it rather than by going to law school. Abe
Lincoln learned that way as well.
What are your hopes
for the company in the future? For the legal industry?
I hope that PlainSite really does change the legal
profession by making lawyers more efficient, more affordable, and more
accountable for their actions. I hope that in the future, government agencies
use PlainSite not just to look things up, but to run government systems.
For the legal industry, I hope that prices come down. The
end of hourly billing may not be possible, but there needs to at least be a
reduction in prices to make legal services affordable to more people.