Where Do Federal Judges Come From?

Below is a memo for a data story from Mark Hansen's Data II class at Columbia Journalism School, in which I received honors. The process and story are described in depth below. R was new software for me, but fun to work in.

My full R code is accessible here.

The original dataset is available on the Federal Judicial Center website. Here's my simplified CSV of that data.


Unlike every other developed nation in the world, the United States elects its judges, at least on a state and local level. Recently, those judicial elections have seen a dramatic increase in spending, with direct campaign dollars more than doubling between the 1990s and 2000s ($83M to $206M) and outside “dark” money recently entering the financial race at a fast pace (see Mother Jones).

For my master’s project, I’m focusing on a Supreme Court case for this term that may overturn the widely accepted prohibition on judge candidates directly soliciting campaign funds. The case, Williams-Yulee v. Florida Bar, arose following a circuit split between various cases on similar “personal solicitation” clauses in the codes of professional conduct for judge candidates.

The federal bench (whose judges are appointed for life rather than elected) tends to be critical of judicial elections and the very concept of campaigning for a judgeship. As a result, argues Florida State University professor Rob Atkinson, an expert on legal professionalism, “federal courts have been much more willing to tolerate restrictions on judicial elections.” Federal judges are not created from thin air though. They’re appointed by presidential administrations drawing on a variety of sources, with the state and local judiciary being one dominant such source. 

As I report and write on this current Supreme Court case, the influence of money in state and local judicial elections, and the legal questions at hand, I’ve found the federal and state/local benches spoken about in isolation. With so few federal appointments each year, they are substantively different. However, federal courts rely on state/local courts as a major source of candidates for federal judgeships, and a corrupting influence in that source could flow to the federal bench.


Data Question

How large a source of federal judge candidates are state and local judges? How has that number changed in recent years? How does that break down across parties and demographic detail?

Setting Up the Data

Drawing on a complete directory of biographical information on all federal judges ever appointed in American history, I created a database of the backgrounds of all appointed judges. The data comes from the Federal Judicial Center, the official “research and education agency of the federal judicial system,” and is accessible at: http://www.fjc.gov/history/home.nsf/page/judges.html.

The data span 1789 to 2014 and cover 3,542 judges appointed to various federal benches (District Courts, Courts of Appeal, Supreme Court, and Customs Courts, as well as a small number of courts like the Court of Claims that no longer exist).

The “employment” field is a long string of biographical professional information about the judge that gives, in chronological order, his or her notable professional roles prior to appointment. Here’s former Chief Justice William Rehnquist’s employment field as an example:

“U.S. Army soldier, Air Corps, 1943-1946<BR>Law clerk, Hon. Robert H. Jackson, Supreme Court of the United States, 1952-1953<BR>Private practice, Phoenix, Arizona, 1953-1969<BR>Assistant attorney general, Office of Legal Counsel, U.S. Department of Justice, 1969-1971”
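Because the roles are separated by the literal tag "<BR>", the field can be split into one element per role. The following is a minimal sketch of that idea rather than the code from my script; the variable names are chosen here for illustration.

```r
# The employment field from the Rehnquist example, stored as one string.
employment <- paste0(
  "U.S. Army soldier, Air Corps, 1943-1946<BR>",
  "Law clerk, Hon. Robert H. Jackson, Supreme Court of the United States, 1952-1953<BR>",
  "Private practice, Phoenix, Arizona, 1953-1969<BR>",
  "Assistant attorney general, Office of Legal Counsel, U.S. Department of Justice, 1969-1971"
)

# Split on the literal "<BR>" tag (fixed = TRUE turns off regex matching).
roles <- strsplit(employment, "<BR>", fixed = TRUE)[[1]]

# Roles are listed chronologically, so the last one is the most recent.
last_role <- roles[length(roles)]

print(roles)
print(last_role)
```

Since the roles come in chronological order, the final element is the natural candidate for the job held immediately before appointment.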

Also present in the data and carried through my creation of the database are the following relevant variables:

  • Name
  • Birth Year
  • Various dates used to signify date they began on the bench (described later)
  • Court name
  • Court type
  • ABA Rating (e.g. “Qualified,” “Well Qualified,” – most blank)
  • Gender
  • Race / Ethnicity
  • Party affiliation of the President that appointed them
  • Name of the President that appointed them
  • Employment (mentioned above)

All the build and analysis on this data is performed in R. Assuming the Federal Judicial Center doesn’t dramatically change the fields they produce this data with, this analysis can use future updated datasets from the same source. (Troubleshooting to ensure full coverage of the job type and date variables would be the only necessary updates, and I included commented-out code that would allow for simple checks on these two points.)

Building the Data – Employment

First, I create various categories for the kinds of prior employment federal judges held. The code is fairly self-explanatory, but I’ll broadly describe the categories and their meanings here:

  • Judge – State or local judge
  • Military – Army, Navy, Marine, JAG etc.
  • Government – Public servant, typically a prosecutor like an Attorney General, U.S. Attorney, or a City Attorney, but also includes employment at federal agencies or branches of government (e.g. Chief Counsel for the Senate Subcommittee on the Judiciary) and clerkships
  • Private – An attorney in private practice or in-house counsel to a corporation
  • Academic – A law professor or instructor
  • Politician – An elected official in a legislative body or direct employee of a political party

Those are the six main categories. Two others are worth noting though:

  • Blank – three observations have blank “employment” fields
  • Scientist – two judges were scientists at one point, but were not scientists at the time of their appointment 

Using three levels of nested for loops for the seven non-blank job types:

  • I create seven dummy variables, one for each job type, that read ‘1’ if a judge ever held that job type in their past.
  • I create a ‘prior’ variable that gives the most recent such job type that the judge held before joining the federal bench 

Per the Rehnquist example above, the former Chief Justice’s employment history codes as follows:
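The coding logic can be sketched roughly as below. This is an illustration and not my actual script: the keyword lists are short, hypothetical stand-ins for the longer hand-picked strings, and the sketch uses small helper functions in place of the nested for loops. Under these assumed keywords, the Rehnquist record gets a 1 for Military, Government, and Private, and a prior of Government.

```r
# Hypothetical keyword lists; the real lists are longer and hand-tuned.
# Order matters in this sketch: the first category with a hit wins.
keywords <- list(
  Judge      = c("Judge,"),
  Military   = c("U.S. Army", "U.S. Navy", "U.S. Marine"),
  Government = c("Department of Justice", "U.S. attorney", "Law clerk"),
  Private    = c("Private practice"),
  Academic   = c("Professor", "Instructor"),
  Politician = c("State senator", "State representative"),
  Scientist  = c("Scientist")
)

# Return the first category whose keywords appear in a single role.
classify_role <- function(role) {
  for (category in names(keywords)) {
    hits <- vapply(keywords[[category]],
                   function(k) grepl(k, role, fixed = TRUE),
                   logical(1))
    if (any(hits)) return(category)
  }
  NA_character_
}

# Build the seven dummies and the 'prior' value for one employment string.
code_employment <- function(employment) {
  roles <- strsplit(employment, "<BR>", fixed = TRUE)[[1]]
  cats <- vapply(roles, classify_role, character(1), USE.NAMES = FALSE)
  dummies <- as.integer(names(keywords) %in% cats)
  names(dummies) <- names(keywords)
  matched <- cats[!is.na(cats)]
  prior <- if (length(matched) > 0) matched[length(matched)] else NA_character_
  list(dummies = dummies, prior = prior)
}

rehnquist <- paste0(
  "U.S. Army soldier, Air Corps, 1943-1946<BR>",
  "Law clerk, Hon. Robert H. Jackson, Supreme Court of the United States, 1952-1953<BR>",
  "Private practice, Phoenix, Arizona, 1953-1969<BR>",
  "Assistant attorney general, Office of Legal Counsel, U.S. Department of Justice, 1969-1971"
)
result <- code_employment(rehnquist)
print(result$dummies)
print(result$prior)
```

The law clerk role counts as Government here because clerkships fall under that category in the definitions above.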

Building the Data – Date

The Federal Judicial Center gives dates in an unfortunately opaque and inconsistent manner, so a good deal of cleaning, checking, and patching was necessary to create a single final date variable that represented the exact date each person joined the federal bench.

First, dates were inconsistently given in “mm/dd/yy” and “mm/dd/yyyy” regimes. The latter regime seemed to only occur for dates prior to 1900. Speculating, I imagine they produce their CSV with Excel at some point, so all the pre-1900 dates become character strings while the post-1900 dates are read into Excel’s date format (which uses January 1, 1900 as its numeric “Day 1” and counts up thereafter).

Three date variables are relevant stand-ins for “joining the federal bench.” The first variable is the Commission Date, the day a judge literally begins his or her commission as a judge. This variable covers all but 25 of the 3,542 observations in the data. As sufficient proxies, I patch the 25 observations with their appointment date, first using Recess Appointment Date (24 observations patched) and then using Senate Appointment Date (1 observation patched).
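The patching order described above amounts to a fallback chain. Here is a small sketch on made-up rows; the column names (commission_date, recess_date, senate_date, bench_date) are hypothetical and may not match the names in the source file.

```r
# Toy rows, not real records: one judge per fallback case.
judges <- data.frame(
  commission_date = c("03/04/1801", "", NA),
  recess_date     = c("", "06/15/55", NA),
  senate_date     = c("", "", "01/20/77"),
  stringsAsFactors = FALSE
)

# Treat both NA and the empty string as missing.
is_missing <- function(x) is.na(x) | x == ""

# Start from the commission date, then patch in order of preference.
judges$bench_date <- judges$commission_date

use_recess <- is_missing(judges$bench_date) & !is_missing(judges$recess_date)
judges$bench_date[use_recess] <- judges$recess_date[use_recess]

use_senate <- is_missing(judges$bench_date) & !is_missing(judges$senate_date)
judges$bench_date[use_senate] <- judges$senate_date[use_senate]

# Coverage check: this should be zero once every row is patched.
remaining <- sum(is_missing(judges$bench_date))
print(judges$bench_date)
print(remaining)
```

A check like the last two lines is the kind of coverage test worth rerunning whenever the source data is updated.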

With complete coverage in a date variable stored as a character string, I then parse that character string into an actual date variable. Here the code is a fairly straightforward for loop of substrings on the date variable.

The substrings had one problem though: “mm/dd/yy” coding works for two different centuries. And so, I needed to separate out, for instance, judges appointed in 2004 from judges appointed in 1904.

I use the birth year variable to make this fix. Any judge born in the 19th century but appointed in the 21st century has the appointment knocked back to the 20th century. Making the reasonable assumption that no judges were appointed to the bench while over 100 years old, this change corrects the potentially misattributed dates.
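A rough sketch of the century fix follows. It is not the code from my script. In particular, the provisional pivot (two-digit years up to 14 are first read as 20xx because the data end in 2014, all others as 19xx) is an assumption made for this illustration, as are the function and argument names.

```r
# Return a four-digit year from "mm/dd/yy" or "mm/dd/yyyy".
# `last_year` is an assumed parameter: the final year covered by the data.
fix_year <- function(date_string, birth_year, last_year = 2014) {
  parts <- strsplit(date_string, "/", fixed = TRUE)[[1]]
  year_text <- parts[3]
  year <- as.integer(year_text)

  if (nchar(year_text) == 2) {
    # Assumed provisional century: 00-14 becomes 20xx, 15-99 becomes 19xx.
    year <- if (year <= last_year %% 100) 2000L + year else 1900L + year

    # Birth-year rule from the text: a judge born in the 19th century
    # cannot have been appointed in the 21st, so move back 100 years.
    if (birth_year <= 1900 && year >= 2000) year <- year - 100L
  }
  year
}

fix_year("03/15/04", birth_year = 1850)    # 1904
fix_year("03/15/04", birth_year = 1950)    # 2004
fix_year("09/24/1789", birth_year = 1745)  # 1789
```

Four-digit years pass through untouched, so only the ambiguous “mm/dd/yy” records are affected.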

With complete and correct dates, I then bring the parsed day, month, and year character strings into a single date variable that uses the international (ISO 8601) date format of “yyyy-mm-dd.” After that, I overwrite those day, month, and year character strings with numeric versions using lubridate functions on the date variable.
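As a sketch of this last step, the pieces can be pasted into a "yyyy-mm-dd" string and converted with as.Date. To stay dependency-free, this illustration uses base R's format() to pull the numeric parts back out; lubridate's year(), month(), and day() do the same job. The data frame and its values are made up.

```r
# Toy parsed pieces, stored as character strings plus a corrected year.
parts <- data.frame(
  month = c("3", "12"),
  day   = c("4", "25"),
  year  = c(1801L, 1969L),
  stringsAsFactors = FALSE
)

# Zero-pad and assemble an ISO 8601 string, then convert to Date.
iso <- sprintf("%04d-%02d-%02d",
               as.integer(parts$year),
               as.integer(parts$month),
               as.integer(parts$day))
parts$date <- as.Date(iso, format = "%Y-%m-%d")

# Overwrite the character pieces with numeric versions from the Date.
parts$year  <- as.integer(format(parts$date, "%Y"))
parts$month <- as.integer(format(parts$date, "%m"))
parts$day   <- as.integer(format(parts$date, "%d"))

print(parts)
```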


Analyzing the Data

A number of interesting patterns emerge from this finalized database.

First, the overall number of federal judges appointed has increased dramatically over time as the judiciary has grown.

Given the disparity in counts, from here on, the analysis pertains only to data from 1900 to present. 

Party affiliation does appear to make minor differences in which professions the appointing President draws from for federal judges.

or, more simply:

Democrats have appointed more judges than Republicans. Of those judges, the two parties largely draw similar proportions of each profession to the bench. Some clear discrepancies appear though:

Some intuitive:

  • Republicans appointed more private attorneys
  • Democrats appointed more politicians

Some counter-intuitive:

  • Republicans appointed (slightly) more government officials
  • Democrats appointed (significantly) more military officers (early in the twentieth century, when military appointees were more common, Democrats were more often in the White House during and immediately following major wars, e.g. Wilson, FDR, Truman…)

Both parties have appointed from academia at very similar rates.

As for judges, the focus of this analysis, Democrats have appointed from the state and local bench at a very slightly higher rate.
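Comparisons like these come down to a party-by-profession table converted to row shares. The sketch below shows the mechanics on invented rows, so the numbers it prints are not findings; the column names party, prior, and year are assumed for illustration.

```r
# Made-up rows for illustration only; these are not real appointments.
toy <- data.frame(
  party = c("Democratic", "Democratic", "Democratic", "Democratic",
            "Republican", "Republican", "Republican", "Republican",
            "Republican"),
  prior = c("Judge", "Judge", "Private", "Politician",
            "Judge", "Private", "Private", "Government",
            "Judge"),
  year  = c(1935, 1965, 1995, 2010, 1905, 1955, 1985, 2005, 1880),
  stringsAsFactors = FALSE
)

# Restrict to 1900 onward, as in the rest of the analysis.
recent <- toy[toy$year >= 1900, ]

# Count appointees by party and prior profession.
counts <- table(recent$party, recent$prior)

# Convert to shares within each party (each row sums to 1).
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

Using row shares instead of raw counts keeps the comparison fair even though one party has appointed more judges overall.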


Over time, across parties, state and local judges are the most common current source of federal judges. That trend continues to grow, after a post-wartime trough.


A similar trend exists for appointees from the private sector. The growth is even sharper in recent years, with formerly private attorneys potentially overtaking state and local judges soon in the overall appointment trend. According to Elizabeth Warren, conservative Senators have increasingly litmus-tested judges and, as a result, the Obama administration relies on the private sector more. She said in February that “it's unsurprising that the president and a majority of the Senate gravitated to nominating corporate lawyers…that most conservative senators could not object to.” As for Republican administrations, a preference for the private sector exists already, as described earlier.

Warren and other Democrats have called for more public servants to be nominated (though Democratic Presidents actually have historically nominated from government slightly less than Republicans). The appointment of public sector employees to the federal bench has indeed been relatively flat, but has risen somewhat since the 1970s.



Comparing all six separate categories together in one chart offers a more complete picture though.

Here’s a lattice of all the judges appointed after 1900:

And the same lattice using smoothed trendlines:

The trends mentioned earlier are clearly visible. Interestingly, the direct appointment of elected officials (politicians) has declined precipitously.

Further Steps

  1. The categories I created are based on personal interpretation and handpicked character strings. This process is open to two potential issues:
    1. My interpretations could be incorrect.
    2. The character strings might miscategorize an entry.
      1. As an example, the string “Justice” could be contained in “Department of Justice” (government) or “Justice, Superior Court” (judge). In this instance, I disambiguated this issue by checking instead for “Department of Justice” and “<BR>Justice.”
      2. Clearly, other such errors could be present in the categorization process. From sanity checks though, I’m largely confident in the data.
  2. I created the dummy variables for each category, but didn’t use them. Obviously, there are federal judges who fall into multiple categories. Given that many candidates clerked for other judges, worked in government, taught in academia, and privately practiced law before their final occupation coded in the “prior” variable, I didn’t want to over-emphasize those common backgrounds. Judges, in particular, once on the bench, don’t tend to leave for another profession. Private practice isn’t really revealing unless it’s the profession an attorney held at the time of appointment. That said, I capture the dummy variables because they may offer some interesting cuts to the data. They may serve in a regression analysis, or offer a revealing look at whether public sector employees are really getting appointed at a lower rate, or if they’re just passing through another profession before entering the federal bench.
  3. Analyzing the demographic data would be great.
  4. The Federal Judicial Center includes a "Judge Identification Number" that appears internal to their system, but it could potentially be a variable upon which I could merge other datasets. For instance, adding a database of nominees (rather than actual appointees) might offer other interesting insights.
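To make the “Justice” issue from the first point concrete, here is a small sketch comparing a naive match with the two narrower patterns. The two employment strings are invented for illustration and are not records from the data.

```r
# Hypothetical employment strings: one government role, one judge role.
examples <- c(
  "Private practice, 1950-1960<BR>Attorney, U.S. Department of Justice, 1960-1970",
  "Private practice, 1950-1960<BR>Justice, Superior Court, 1960-1970"
)

# A naive search flags both records.
naive <- grepl("Justice", examples, fixed = TRUE)

# The narrower patterns separate the two cases.
is_doj          <- grepl("Department of Justice", examples, fixed = TRUE)
is_justice_role <- grepl("<BR>Justice", examples, fixed = TRUE)

print(data.frame(naive, is_doj, is_justice_role))
```

One limitation worth checking: the "<BR>Justice" pattern only matches when the role follows a line break, so a record whose very first listed role begins with "Justice" would need a separate test.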