Wednesday, October 1, 2014

Which countries have online genealogy records?

Sometimes it helps to get an overview of what is and what is not available online in the way of genealogical records. If you live in the United States or the United Kingdom, you are swimming in online records. However, as genealogists we are realistic enough to realize that there are still huge numbers of records left to digitize. But what about those records available for other countries? How do they stack up?

Answering this question can be complicated because there may be local online sources in the local language that don't show up in any of my Google searches. It is easy for genealogists to become fixed on a particular area of the world because of our ancestral connections. For example, if I have no ancestors from Germany then why would I be concerned about the number of records from Germany? On the other hand, those of us who spend a great deal of time teaching and assisting others in their research find it necessary to know what records are available in a huge variety of countries.

It is also interesting to note the progress of adding worldwide genealogical records to the online databases. If you examine the record lists, sometimes called collections, from each of the major online genealogical database programs, you should get a good idea of the availability of records from any given country. I thought it would be interesting to choose some examples. Obviously, listing every country in the world would be way beyond the scope of this blog post.

For this particular example, I decided to look at,, and For this illustration, I decided to choose the same countries from each of the large databases. Here are my findings from examining each program. Each website has a list of countries with a number following. In each case, I found it necessary to explain what I thought was the comparable method of numbering records.

One complication of trying to determine these numbers is that countries tend to have changed boundaries and identities over the years. For example, do I rely on the number of records available for "Germany" when all of the various countries that have been formed over the years and called Germany should probably be included? Do I include the Austro-Hungarian Empire in Germany? Nevertheless, choosing some of the countries can illustrate the relative availability of records. In addition, comparing and to some extent with the other two databases is somewhat unrealistic. Each of the large databases has its own unique records, and the number of records does not specifically relate to any particular user's experience.

 has a total of 1837 collections. There is no practical way of determining the number of records because the column that lists record numbers does not give the total number of records in any particular collection. Recently, however, has implemented a feature linked to the map on the records page that gives the number of records both indexed and unindexed.

  • United States, indexed records 3,161,326,075, record images 465,222,043 in 845 collections.
  • England, indexed records 687,347,166, record images 252,856,175, in 79 collections.
  • Denmark, indexed records 14,561,774, record images 14,070,366, in 7 collections.
  • Russia, indexed records 10,389,774, record images 18,540,881, in 16 collections.
  • Argentina, indexed records 20,057,795, record images 13,606,189, in 29 collections.
  • Ghana, indexed records 243,908, record images 1,177,156, in 2 collections.
  • New Zealand, indexed records 16,465,909, record images 12,056,202, in 5 collections.

 also gives a number of collections. The total number of collections presently listed is 32,409. The number of records in each collection is listed in the Card Catalog. However, ascertaining the number of records in any given area or country would require adding up the numbers of records from each of the collections in the list.

  • United States, collections 25,693, number of records; way over 1 billion.
  • England, collections 1428, number of records; way over 1 billion.
  • Denmark, collections 56, number of records; probably less than 100 million if you exclude public member photos and scanned documents, public member stories and private member photos and stories.
  • Russia, collections 36, number of records; probably around 2 to 3 million.
  • Argentina, collections 9, number of records; around 20 million.
  • Ghana, collections 3, number of records; less than 300,000 if you exclude records from that are not specifically identified to Ghana.
  • New Zealand, collections 41, number of records; more than 30 million.

 also has a method of searching records by place. The exact number of records must also be approximate.

  • United States, well over 100 collections with billions of records.
  • England, 10 collections with over 300 million records.
  • Denmark, 4 collections with approximately 5,000,000 records.
  • Russia, 4 collections with more than 1 million records.
  • Argentina, 6 collections with approximately 9,000,000 records.
  • Ghana, no records.
  • New Zealand, no specific records.

There does not appear to be an easy way to determine the number of collections or records in any particular geographic area. But it is easy to determine that most of their records come from the United States, United Kingdom, Australia, New Zealand, and Ireland. There appear to be no records from Denmark, only one set of records from Russia, no records from Argentina, and no records from Ghana. As I stated above, there really is no way to compare the number of records on with the other websites because of the specialized nature of the entire database.

It is interesting to speculate whether or not the drop-off in records relates directly to the actual existence of records or merely the availability of digitized records to this point in time. I would suspect that many countries that appear to have limited records actually have very detailed records that have yet to be digitized. It will be interesting to see what happens in the future.

Tuesday, September 30, 2014

Documents from the Historical Society of Pennsylvania Digitized by FamilySearch

In a blog post of 29 September 2014, Paul Nauta of FamilySearch explained the new agreement between FamilySearch and the Historical Society of Pennsylvania. Quoting from the post entitled, "FamilySearch and Historical Society of Pennsylvania to Publish Historical Documents Online,"
The Historical Society of Pennsylvania (HSP; online at, one of the largest and most comprehensive genealogical centers in the nation, and FamilySearch (online at, a nonprofit premier family history and records preservation organization, announced a joint initiative to digitally preserve select collections of the historical society’s vast holdings, starting with compiled family histories. The project is now underway, and the digitized documents will be accessible for free at
 I was interested in this development because a significant number of my ancestors came from or through Pennsylvania. My Great-great-great-grandfather William Linton was originally buried in Philadelphia and then moved to the Westminster Cemetery in Bala Cynwyd, Pennsylvania.

The post goes on to tell about the Historical Society and the records:
Founded in 1824 in Philadelphia, the Historical Society of Pennsylvania is one of the oldest historical societies in the United States. It is home to some 600,000 printed items and more than 21 million historical manuscripts and graphical items. Its unparalleled collections encompass more than 350 years of America’s history—from the 17th-century to the contributions of its most recent immigrants. 
The initiative will digitally preserve and publish online the society’s many genealogies and local histories, family trees, and related family documents and manuscripts that contribute to the understanding of many family histories. Collections of particular interest might be those of Pennsylvania’s founding families, including William Penn and others. 
Some of the society’s holdings date back to before the Revolutionary War. The rare histories include family papers, cataloged photographs, genealogies, African-American collections such as a history of the Dutrieuille family and related families, a cookbook compiled by Ellen Emlen during the Civil War in 1865, Jewish resources, sources about daily lives in the history of the United States, and much more.

Sunday, September 28, 2014

The Handwriting Challenge

For many years I was a partner with my father as we practiced law. My father preferred to write out all of his legal briefs by hand. This would not have been a problem except that his handwriting was indecipherable much of the time. It was common for his secretary to come to me for an attempt at the translation of something he had written. In later years we tried to get him to use a computer, but he was never comfortable with composing legal briefs on the computer and even then the print-outs would be covered with his handwritten revisions.

When I became interested in genealogy, I soon encountered my Great-grandmother's handwriting. Mary Ann Linton Morgan (b. 1865, d. 1951) had very good handwriting and I soon got extremely good at reading everything she wrote. Here is a sample:

Just like with my father, if I wanted to know what my Great-grandmother had to say, I had to learn to read her handwriting. I was wondering what the attitude of my new state of Utah was towards teaching cursive in the schools. I found the current policy on the website for the Utah State Office of Education. Here is what they had to say:
Utah studied cursive writing during the 2012-2013 school year. A committee of classroom teachers, university faculty, and literacy specialists met to look at the relevant research and data. This committee created language for the Utah Core Standards that was presented to the Utah State Board of Education (USBE) in April, 2013. Public comment was requested during April and May, 2013, and a summary of the comments was presented to the Board on June 7, 2013. 
The State School Board voted to approve the additions to the Utah Core Standards that include teaching manuscript and cursive writing and also include building fluency in reading cursive writing. 
Handwriting (both manuscript and cursive) is an important skill for students to learn. Teaching and practicing writing allows students to write letters correctly and efficiently. Fluent writers are able to focus on generating idea, producing grammatically correct text, and considering audience. Even when a student moves to a computer or other device, that writing fluency is important to the composing process.
Compared to the attitude towards teaching cursive in many other areas, this shows a very positive position. It is very obvious to anyone beginning research in genealogical sources that one of the real challenges is deciphering handwriting. The further we go back in time, the more difficult the challenge in reading the handwriting.
Assuming that you can read cursive at all, this example from the 19th Century is fairly easy. As we step back in time, we begin to see some changes. This is evident with handwriting from the 18th Century. Here is an example:

George Bickham's Round Hand script, from The Universal Penman, c. 1740–1741.

Of course, my examples are from English-based writing. There is a whole different set of challenges if the manuscript is in a non-English language. Quoting from Wikipedia: George Bickham the Elder:
George Bickham the Elder (1684–1758) was an English writing master and engraver. He is best known for his engraving work in The Universal Penman, a collection of writing exemplars which helped to popularise the English Round Hand script in the 18th century.
Bickham and others popularized a style of writing called Roundhand. In the 17th and 18th Centuries, this form of writing spread across Europe and into America. A good example of the influence of Roundhand in America is the U.S. Declaration of Independence which is believed to have been written by Timothy Matlack. See History of penmanship.

As we move back into the 16th Century, we find that handwriting is becoming less and less recognizable from our "modern" perspective. Here is an example:

Script type based on the hand of its cutter, Robert Granjon
Of course, these examples are done by professionals. Here is an example from the will of William Shakespeare written in 1616 in secretary hand, a very difficult script for modern eyes.

William Shakespeare and unknown scribe
This particular example, dating from 1557, from the French punch-cutter (type maker) Robert Granjon, reflects the everyday handwriting of Europe at the time. As we go back another 100 years to the 15th Century, the handwriting becomes more and more difficult to read without intense concentration and study. Here is an example from 1412:

Geoffrey Chaucer by Thomas Hoccleve (1412)

We could keep going back in time indefinitely. When we get back into the 14th Century and even further, we find "Blackletter" which is also known as Gothic script. Here is an example from the 15th Century:

Calligraphy in a Latin Bible of AD 1407 on display in Malmesbury Abbey, Wiltshire, England.
When you get back to the 13th Century, you encounter Carolingian minuscule, a script developed as a calligraphic standard in Europe so that the Latin alphabet could be easily recognized by the literate class from one region to another. It was used in the Holy Roman Empire between approximately 800 and 1200.

Carolingian minuscule
By this early date, we have left all of the modern languages behind. We have even left Middle English behind. Just in case you forgot your Middle English, here is a sample from the Canterbury Tales:
Whan that aprill with his shoures soote
The droghte of march hath perced to the roote,
And bathed every veyne in swich licour
Of which vertu engendred is the flour;
Whan zephirus eek with his sweete breeth
Inspired hath in every holt and heeth
And if you are still not convinced, here is an example of an Old English text with a translation:

[I thank the almighty Creator with all my heart that he has granted to me, a sinful one, that I have, in praise and worship of him, revealed these two books to the unlearned English nation; the learned have no need of these books because their own learning can suffice for them.]

Of course I have an ulterior motive for giving all these examples. In the last two related posts and with this one, I am illustrating the effort that is needed for a genealogist to do research before 1500 A.D. I am reasonably certain that most of those genealogists who brag about how far their pedigrees extend into the past have never dreamed about reading any of these old scripts. Copying an old pedigree out of a book or from a website is not doing genealogy. It is nothing more or less than fiction writ bold in an online family tree. If you had the knowledge to read these old documents, you wouldn't be stupid enough to believe them as a basis for a pedigree.

Is Genealogy Big Data?

"Big Data" is a new jargon term for computer programming and technology approaches to massive amounts of information. Here is one definition of "Big Data" from Wikipedia:
Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications. 
The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, prevent diseases, combat crime and so on."
As genealogists, we like to think that we are good at what we do: finding ancestors. In fact, the process we call genealogical research, that of finding, evaluating and recording the information found about our ancestors, could be done by computer programs. Currently available programs from the large online database programs such as and and closely followed by, have already demonstrated that computer programs can find sources more efficiently and at least as accurately as any human researcher. In effect, they are tackling the issues of "Big Data" as they apply to genealogy.

The main obstacle to computer programs completely overtaking humans is in deciphering handwritten documents. But once the "indexing" process is done, the computers can and will take over.

Human researchers, when confronted with all of the genealogical information online, point out that the online programs "don't have all of the sources." I would add the word "yet." Even though significant amounts of genealogical data are locked up in paper (and other media) around the world, that situation is changing rapidly. 

We only have to look into the future a short time to see that the domination of paper will change. There are presently more than 7.2 billion people on the earth. The percentage of Internet usage varies from over 96% in countries such as Iceland, to somewhat lower numbers in developing countries. The total number of Internet users worldwide is over 2.8 billion. If you think of the average family size, you can see we are almost at the saturation point. 

What this means is that in the future, there will be no need for "research" about families. All of the family history data about each one of us globally will, for all intents and purposes, be readily available. Right now, if the average genealogist who is in a developed country or whose ancestors were in a developed country, signs up for or, they can expect the programs to find source records to automatically build a pedigree back two or three, and perhaps four, generations. All the user has to do is confirm the matching record hints. If you think about what will happen two or more generations into the future, you will see that our descendants will automatically have four and five or many more generations provided to them by the computer programs.

At the same time, these huge databases will keep gobbling up the records of the past at a huge rate. As traditional genealogists, we are fixated on the human-based, purely mechanical process of gathering family history data. We talk endlessly about reasonably exhaustive searches and proof and other issues. All the time, these issues are vanishing right before our eyes. Much of what current genealogists do is duplicate research that has already been done in the past. To the extent that computer systems allow us to avoid this duplication, our efforts can be directed to those areas that really need research. 

Let's suppose that, for an example, solves the problems of duplication in the Family Tree program. Let's further suppose that the partnership with, and is successful in opening the FamilySearch records to millions more users. Let's further suppose that FamilySearch finishes digitizing all of its existing records going back to 1938. Let's also suppose that the rate of digitization of records increases with attention being paid to smaller and smaller collections. In addition, let's suppose that a way is created for people who have collections of records in their possession to share those records online. Let's go even further and suppose that there are millions upon millions of people involved in this process, not just the tiny number involved today.

Do you really think you can avoid this inevitable process? There will still be dead ends and brick walls in the past. But they will be real missing data, not just a failure to do adequate research. Don't underestimate the impact of this process.

Saturday, September 27, 2014

Can I research back into the Middle Ages?

Oriel College Charter (Public Domain)
I recently posted about some of the limitations of the older records in the online genealogical database programs. But there is another entirely different aspect to researching back into the dim past and that is the practicality of doing research into the Middle Ages and, if possible, beyond. The Middle Ages are usually defined as beginning with the collapse of the western Roman Empire around the 5th Century and as ending with the Renaissance and the Age of Discovery in the 15th Century. This particular characterization of history belongs exclusively to the countries in Western Europe. So if your ancestors came from Asia, Africa, North or South America, it would be a good idea to study the history of your own place of origin.

From a genealogical standpoint, our main interest is the existence of records of people's lives during any part of the world's history. Interest in genealogy is not confined to the descendants of Western European ancestors, and the existence of very old genealogical records in any area other than Western Europe depends on some of the same factors that determine the existence of records from that time period in Europe.

Gutenberg Bible Genesis
What are the realities of doing genealogical research before 1500 A.D.? As I have mentioned previously, the first and possibly most important historical fact is that printing with movable type did not begin until after 1439, so any books or manuscripts before that date (and for a considerable time afterwards) were written by hand. The number of books that have ever been published is nearly 130 million. If we go back to the year 1500, we get into what are called the incunabula, that is, books printed before the year 1501 in Europe. See Wikipedia: Incunable. The total number of books in this category is just over 22,000. The year 1501 is significant because that is the date that Aldus Manutius published his first book in italic type.

So by moving back into the Middle Ages with our genealogical research we are leaving the world of printed books behind and entering the world of manuscripts (from the Latin manu "by hand" and scriptus "written").

Augustine Gospels Full-page miniature of St. Luke as an evangelist, 6th century. 
For information about how manuscripts were made, see Manuscript Basics from the Free Library of Philadelphia. Here are some additional websites that talk about medieval studies:

The total number of medieval manuscripts in existence is subject to some conjecture, but of course, there are no new manuscripts being made, so the number is related to those in libraries, museums, privately held, etc. For a detailed analysis of the number of manuscripts in existence, see:

Buringh, Eltjo. On Medieval Manuscript Production in the Latin West: Explorations with a Global Database. Leiden [u.a.]: Brill, 2011.

Buringh estimates the total number at around 2.9 million, but this number includes every handwritten manuscript including those from Egypt, sheet music, autograph letters, deeds, charters and other types of documents from around the world.

OK, now what are the challenges? How many of these manuscripts would have genealogical data? The difficulty in determining this number is that the libraries do not break out records by year but only by catalog entry. 

Family tree of the Ottonians, from the Wolfenbüttel manuscript (Cod. Guelf. 74.3 Aug., p. 226)
To get some idea, you might want to look at the Foundation for Medieval Genealogy website. Many of the manuscripts are protected by "copyright" claims from the library or other institution. I find it difficult to imagine how anyone can claim a copyright to a book published in the 1300s, but that is what they do. Here is an image that is in the public domain:

Genealogy of the Kings Of England, In A Collection Of Chronicles Of English History And Miscellaneous Tracts
Even assuming that you could work out the details of gaining access to any substantial number of these medieval manuscripts, you would still have all of the challenges I outlined in my previous recent blog post. See What are the sources before 1550 A.D.?

The Dark Tunnel of Time

Genealogists are great believers in uniformitarianism. Unfortunately, the assumption that conditions in the past were the same as they are today may once have applied to the study of geology, but it is fatally inappropriate for genealogists. As a methodology, assuming that conditions in the past were the same as those today leads researchers into dead ends and research traps. How is this belief in uniformitarianism manifested? Usually as a woeful lack of knowledge about history and particularly, the history of the area where the genealogist is researching.

Perhaps a definition of uniformitarianism would be helpful. It was first defined back in the 1830s by Charles Lyell, a geologist, in a multi-volume work:

Lyell, Charles. Principles of Geology: Being an Inquiry How Far the Former Changes of the Earth's Surface Are Referable to Causes Now in Operation. 1834.

A current definition of uniformitarianism comes from Wikipedia: Uniformitarianism:
Uniformitarianism is the assumption that the same natural laws and processes that operate in the universe now have always operated in the universe in the past and apply everywhere in the universe.
This sounds harmless enough, but genealogists are not dealing with rock strata. They are dealing with people whose technological, cultural, social and political lives change dramatically through time. These changes include naming patterns, family structures, laws, political boundaries, social organizations, governments and almost every other aspect of history that I could possibly think to include. By the way, geologists have dramatically altered their way of viewing the past since the 1800s and uniformitarianism, as proposed by Lyell, has been extensively modified.

Let me give a trivial and very simple example of what I am talking about. It is the practice of recording place names. I wrote about this recently when I pointed out one of my ancestors purportedly born in South Kingston, Washington, Rhode Island, United States in 1630. I could give hundreds of examples of this same issue. Of course, this particular example is only the beginning.

In the title to this post, I refer to "the dark tunnel of time." For many genealogists, peering into the past is like exploring a vast cavern or tunnel with a weak and faulty flashlight. They see only small glimpses of what is waiting just outside of their weak beam of light and when they record what they see, the view is so narrow as to be totally useless at best and misleading at worst. In these dark and dismal conditions, the genealogist reverts to imposing what he or she knows about today on the past; of course, if we have a city, county, state and country today, those conditions must have always existed.

Here are some concrete and illuminating suggestions for making your way out of the dark tunnel:

  1. Take time to read a good general history of each country where your ancestors lived. 
  2. Take time to read a good general history about the states, provinces and other areas where they lived.
  3. Extend this investigation down to the local level. Read books on the history of the places where your ancestors lived.

If the person who added the Rhode Island entry had stopped, even for a very short time, and thought about the reality of the situation, perhaps the entry would not have existed. But then again, I find the same problems of understanding and reporting the past as if it were the present every time I examine anyone's online family tree.

Friday, September 26, 2014

Building a Bridge in the Middle of the Air

Frequently, someone will approach me for assistance in finding a remote ancestor. I have a standard set of questions that I ask to clarify the research objective. I generally ask when the ancestor lived and exactly where they lived. I usually pursue the questions by briefly verifying the time period involved and the location of an identifiable event. More often than you would expect, the person requesting assistance has absolutely no idea about the identity of the distant relative or any of the distant relative's descendants. For example, if the person were seeking information about a distant great-great-great-grandfather, I will continue asking the same questions about each of the descendants of the remote ancestor until the person requesting assistance can provide some concrete information.

Usually, the person's research objective has been selected merely on the basis of a missing ancestor in a family tree, usually in the form of a fan chart. I liken this situation to attempting to build a bridge by starting in the middle of the air before finishing the supports for each side of the structure. Almost always, the missing ancestor is missing because of a lack of adequate research concerning the ancestor's descendants. In some cases, after questioning people about each generation of their ancestry, I have found that the only ancestors that they know anything about are their own parents.

Commonly, the inquiry about the remote ancestor relies entirely upon unreliable and unverified information in the generations leading up to the target ancestor. The enticement of an empty location on a pedigree chart is sometimes overwhelming. Some researchers, particularly new researchers, think that "doing their genealogy" involves extending a pedigree before spending any time becoming acquainted with the intervening ancestors. Frequently, I find incomplete names, approximate dates and places not fully specified.

There is a delicate balance between encouraging the researcher and throwing cold water on the whole project. Sometimes the potential investigator fails to grasp the significance of the lack of supporting data in his or her pedigree. As a result, sometimes we part ways on a less than satisfactory basis. I am certain that the person, under these circumstances, believes that I did not understand what they were trying to accomplish. Likewise, I am certain that they did not understand what I was saying. Additionally, they did not realize the significance or the importance of establishing all the facts leading up to the missing information.

It is not difficult to determine in any pedigree the point at which the information available has become entirely speculative rather than based upon concrete, supported information. There is always a point at which dates become approximate and the places lose their specificity. To the degree that the information is either speculative or incomplete, subsequent generations lack any support.

I constantly advise genealogists, even experienced ones, to examine their data and avoid the problem of trying to build a pedigree in the air.