Digitized Telephone Directories, 1891-1988 Data Package
This dataset contains metadata records for a subset of 3,513 reels of US telephone directories, digitized from microfilm, from the Digitized Telephone Directories collection. Full text OCR files are also included for the subset of records that have them.
This experimental dataset was created as part of an LC Labs experiment in collaboration with AVP to understand the benefits, risks, quality benchmarks, workflows, compilation methods, transformations, and documentation practices required to assemble datasets for public use in the cloud.
Metadata | Metadata formats | Full text OCR files |
---|---|---|
3,511 records | .csv, .json | 486 .txt files |
Included in this data package is comprehensive documentation of source data or collection provenance, the contents of the data package, and how the data package was created. Here are some particular sections of interest as well as a link to the full documentation:
There are two main options for accessing and using this data package: (1) downloading files directly from this page and (2) using Python for more advanced tasks.
The following list outlines the contents of this data package. Many of the individual files in the data package are linked directly on this page, so you can download and use them immediately. Zipped files are available for bulk download of all or part of the data package.
Download everything | |
---|---|
Sample the data | |
Download the documentation | |
Download the metadata | |
Download the full text files | |
While direct downloads are more convenient for most activities, users familiar with Python can perform more advanced and complex tasks programmatically.
For your convenience, we developed a number of Jupyter Notebooks to help you get started.
View the Python notebook for this data package
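As a starting point outside the notebook, the following is a minimal sketch of loading the metadata records with the Python standard library. It assumes the metadata has already been downloaded; the path `metadata.csv` is a hypothetical placeholder, and the sketch assumes the `.json` file holds a list of record objects, which may not match the actual layout.

```python
import csv
import json
from pathlib import Path


def load_metadata(path):
    """Load metadata records from a .csv or .json file into a list of dicts."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # newline="" is how the csv module expects files to be opened
        with path.open(newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    if suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: the JSON file is an array of record objects
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of records")
        return data
    raise ValueError(f"unsupported metadata format: {suffix}")


if __name__ == "__main__":
    # Hypothetical local path; replace with the downloaded metadata file
    metadata_path = Path("metadata.csv")
    if metadata_path.exists():
        records = load_metadata(metadata_path)
        print(f"Loaded {len(records)} records")
```

Every CSV value is returned as a string, so numeric or date fields need to be converted before they are compared or sorted.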
For bulk downloads, refer to this Python script for downloading files in bulk. Sample commands for this data package:
Download all OCR'd text files in this package
python bulk_download.py --package "https://data.labs.loc.gov/telephone/" \
    --out "output/telephone/"
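Once the text files are on disk, a simple search is a common first step. The sketch below counts case-insensitive occurrences of a term across the downloaded `.txt` files. It assumes the files were saved under `output/telephone/` as in the sample command and that they are UTF-8 encoded; the function name `count_term` is illustrative and not part of the bulk download script.

```python
from pathlib import Path


def count_term(directory, term):
    """Return {relative file path: count} for .txt files that contain term."""
    needle = term.lower()
    if not needle:
        raise ValueError("term must be non-empty")
    root = Path(directory)
    hits = {}
    for path in sorted(root.rglob("*.txt")):
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        count = text.count(needle)
        if count:
            hits[path.relative_to(root).as_posix()] = count
    return hits


if __name__ == "__main__":
    out_dir = Path("output/telephone/")
    if out_dir.is_dir():
        for name, count in count_term(out_dir, "hardware").items():
            print(f"{name}: {count}")
```

Because the text comes from OCR of microfilm, exact string matching will miss misrecognized words, so counts should be treated as lower bounds.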
Source collection | U.S. Telephone Directory Collection: The Library of Congress makes available to the public an extensive collection of past and present city, telephone, and reverse telephone (criss-cross) directories for the United States and many foreign countries. These directories are available in a variety of formats and locations. The collection spans most of the 20th century, and includes directories from Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, the District of Columbia, Florida, Georgia, Hawaii, Iowa, Maryland, Pennsylvania, and the city of Chicago. All the directories and their metadata records are in English. There is not a one-to-one correspondence between metadata records and directories, as some microfilm reels contained multiple directories when they were digitized. There is a mix of white pages and yellow pages. Some directories may be missing pages due to damage. |
---|---|
Rights statement | All white pages are in the public domain, as are any pre-1964 yellow pages that were not registered and renewed for copyright. For more information, see https://www.loc.gov/collections/united-states-telephone-directory-collection/about-this-collection/rights-and-access/. |
Date created | 2023-05-05 |
Date updated | 2024-03-29 |
Creators & contributors | |
Cite this dataset | |
Curatorial questions | For curatorial questions about the content of the collection or technical questions about the dataset formats and composition, please contact the History and Genealogy Section via the Library's Ask a Librarian service at https://ask.loc.gov/genealogy-local-history/. |
Access questions | For questions and technical issues about download and access, please submit a ticket on GitHub or email the LC Labs Team at [email protected]. |