Wednesday, July 11, 2012

Where Is My Free Genealogy Data?

One of the things I like (and respect) about Thomas MacEntee is that he really puts himself out there, and his blog post today is another fine example, as he revisited the question that I have heard at both small and large conferences: "Where Is My Free Genealogy?" His post speaks mostly to the service side of the industry (speaking, researching, etc.), so I wanted to briefly highlight some of the issues around making genealogy data accessible.

The following are the key components behind the "cost" of genealogy data: (1) acquiring the materials, (2) digitizing and transcribing them, (3) hosting them somewhere, (4) providing search capabilities to mine through it all, and (5) achieving a higher level of quality and source-ability. The more data you try to make available, the higher the costs in each of these areas.

Acquiring the Materials: For the most part, genealogical information is a plentiful resource, with treasure troves of data tucked away in libraries, churches, and local societies all over the country. Some resources, like cemetery tombstones, are simply sitting out there in the open. It's these smaller, more accessible items that are often posted online for free, typically through the generous effort of someone who volunteered their time to seek them out.

But, genealogists know the real value is in the larger collections, most of which have been microfilmed or remain locked away at state/national archives. Prior to the indexing efforts of FamilySearch, there were few if any large collections online for free. And don't be fooled into thinking that the stuff FamilySearch is posting online is "free"... it costs LOTS of money. We're just fortunate that they are absorbing all of those costs for the mutual benefit of the industry and their church members.

With my own project, for years I've been purchasing actual copies of the original documents that source the information in my database. Some of these items were produced in very low quantities, and few copies remain. Others are handwritten, one-of-a-kind originals. I've spent an enormous amount of money putting together this collection.

Digitizing and Transcribing: While technology continues to improve, these two critical steps are very costly and time-consuming. Most of the larger companies delegate the work to offshore labor farms, where the costs are significantly lower. Even much of the online information you enjoy using was "Made in China," or some other country.

Whether a company is using offshore staff, or handling the process with our own citizens, the people doing the work deserve to be compensated for their time, and the costs add up. Think about this... whenever you visit your doctor or consult an attorney, a portion of the fee you pay them goes towards transcribing billing, insurance and medical information. The people that do that work get paid, so why do genealogists think that the people transcribing genealogy data shouldn't?

Hosting the Data: Most genealogists that I've talked with (about this issue) have no clue as to just how expensive it is to host information online. They've simply seen too many examples (e.g., RootsWeb) where hosting pages of content is free or relatively inexpensive. But that's not the type of service required to host large volumes of data and images.

When Inc. acquired RootsWeb, they immediately felt the cost impact, which led them to place advertisements upon pages of free information. So, while the information remains "free" to use, we're forced to endure advertising and offers to join their service.

Adding Search Capabilities: Genealogy is not a simple process, and as a collection of information grows, the tools needed to search it effectively and efficiently become a costly challenge. You need teams of Programmers to create the tools, Database Administrators to optimize the searches, and Designers to create productive user experiences. These staffing requirements are not cheap.

Quality and Source-Ability: Prior to the major indexing efforts of FamilySearch, there were few projects that delivered free information with a high level of accuracy AND, more importantly, source-ability. One of my pet peeves with a lot of free information posted on the Internet is that it lacks clear source information, making it a challenge to utilize in your own research (if you are particular about that kind of thing). But, roll back the clock a few years, and even the industry leader, Inc., did only a mediocre job on this point.

When I decided back in 2003 to enter the genealogy data fray, data quality and accurate sourcing were two of my top priorities. I didn't always get it right, but I've continued to improve and expand in both of these areas. We get "excited" to see the information, but it's equally (or more) important to be able to identify where it originally came from; otherwise we have no way of verifying its accuracy!

So, as you can see, there are a lot of steps in getting information from a piece of paper or microfilm into a searchable online database you can access from the comfort of your own home (or local library). It's great that FamilySearch is willing to commit millions of dollars to making their collection freely accessible; let's hope they are able to continue to do so for years to come.

But for the rest of the companies, and the hard-working people who have chosen "genealogy" as a profession, the customer will most likely always be expected to help pay for these costs. And as Thomas points out, genealogy services are very undervalued compared to other industries.

Being a small player in the genealogy industry, I am VERY appreciative of those researchers that support my project financially. But, it gets frustrating week after week trying to explain to those people who feel the need to complain (and some even with vulgar language) about my annual subscription which nets out to $0.09 per day. I keep asking myself, how is that so terribly unaffordable? And why is it necessary to be hostile about it?

Try adding up what you pay annually for your cellular phone or cable television and then calculate how much that costs per day... now that's something to be hostile about!
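If you'd rather let a computer do that math, here is a minimal Python sketch of the per-day calculation. The monthly bill amounts are made-up placeholders (not anyone's actual rates), and the function names are just ones I picked for illustration; plug in your own numbers.

```python
def annual_from_monthly(monthly_bill: float) -> float:
    """Convert a monthly bill into its yearly total."""
    return monthly_bill * 12


def cost_per_day(annual_cost: float, days_per_year: int = 365) -> float:
    """Spread an annual cost evenly across the days of a year."""
    return annual_cost / days_per_year


# Hypothetical monthly bills, for illustration only; replace with your own.
monthly_bills = {
    "cellular phone": 80.00,
    "cable television": 100.00,
}

for name, monthly in monthly_bills.items():
    per_day = cost_per_day(annual_from_monthly(monthly))
    print(f"{name}: ${per_day:.2f} per day")
```

Running the same division in reverse, $0.09 per day works out to roughly $33 over a 365-day year.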

Thanks, Thomas, for reminding genealogists that the people working to make their research process easier and more fruitful "deserve" to be fairly (not barely) compensated.
