About the Congressional Memory Project
Developed by Deb Kumar Roy
Internet Multicasting Service
Overview
This is an experimental text and audio server which enables access to
the proceedings of the U.S. House of Representatives. There are three
methods for accessing the archives:
- Search the archives constrained by speaker, date, time, and keywords
- Browse a complete day's archives
- Jump directly to a specific date and time in the audio archives
We have developed a custom speech processing system which attempts to
align the text of the congressional record with the corresponding
audio based on voice analysis.
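As a rough illustration of how an alignment between text and audio can
support jumping to a specific time, the sketch below looks up the
speaker turn that contains a given time offset. The row layout
(start time in seconds, speaker, transcript line number), the sample
values, and the function name turn_at are all hypothetical; the format
of the server's real index is not described here.

```python
from bisect import bisect_right

# Hypothetical index rows, sorted by start time:
# (start_seconds, speaker, transcript_line)
SAMPLE_INDEX = [
    (0.0, "Speaker A", 1),
    (95.5, "Speaker B", 40),
    (310.0, "Speaker A", 122),
]


def turn_at(index, seconds):
    """Return the row for the speaker turn in progress at `seconds`."""
    starts = [row[0] for row in index]
    # bisect_right finds the first turn starting after `seconds`;
    # the turn just before it is the one in progress.
    pos = bisect_right(starts, seconds) - 1
    if pos < 0:
        raise ValueError("time precedes the first indexed turn")
    return index[pos]


if __name__ == "__main__":
    print(turn_at(SAMPLE_INDEX, 100.0))
```

Because the rows are sorted by start time, the lookup is a binary
search, so it stays fast even for a day with thousands of turns.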
The Databases
For each day that either house of Congress is in session we archive
both the text transcript of the proceedings (which is manually
transcribed) and a digital audio recording of the entire proceedings.
Currently the server only supports proceedings of the U.S. House of
Representatives.
On a typical day in which the House of Representatives is in session,
the text transcript is about 15,000 lines, and there are about 10 hours
of audio.
Audio to Text Alignment
The audio and text are aligned using automatic speaker identification
(speaker ID). The steps involved in performing speaker ID on the
House audio archives are:
- Manually find and collect two 30-second samples of recorded speech
from each member of the House. We currently have models for 357 of
the 435 members; these members account for over 95 percent of the
talking in the archives.
- Extract the energy and cepstral coefficients from each sample of
audio.
- Generate Gaussian models (mean vector and covariance matrix) of each
sequence of cepstral coefficients.
- Parse the congressional record for each day and extract the
sequence of speakers who spoke.
- For each day, use a Viterbi search to align the Gaussian speaker
models of each speaker to the audio archives. The speaker sequence is
used to constrain the Viterbi search. The Viterbi search generates an
index database which correlates speaker turns in the audio and text.
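
The modeling and alignment steps above can be sketched roughly as
follows. This is a minimal illustration under stated assumptions, not
the project's actual implementation: it assumes the feature vectors
have already been extracted, it uses diagonal-covariance Gaussians
rather than full covariance matrices to stay short, and the names
fit_gaussian, log_likelihood, and align are hypothetical. The Viterbi
search is constrained so that the speakers must appear in the order
given by the transcript, and each frame either stays with the current
speaker or advances to the next one.

```python
import math


def fit_gaussian(frames):
    """Fit a diagonal Gaussian (per-dimension mean and variance)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so a constant feature cannot divide by zero.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def log_likelihood(frame, model):
    """Log density of one feature vector under a diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))


def align(frames, speaker_sequence, models):
    """Return (speaker, start_frame, end_frame) turns, end exclusive."""
    T, K = len(frames), len(speaker_sequence)
    if K == 0 or T < K:
        raise ValueError("need at least one frame per speaker turn")
    neg_inf = float("-inf")
    score = [[neg_inf] * K for _ in range(T)]
    # advanced[t][k] is True when turn k begins at frame t.
    advanced = [[False] * K for _ in range(T)]
    score[0][0] = log_likelihood(frames[0], models[speaker_sequence[0]])
    for t in range(1, T):
        for k in range(min(t + 1, K)):
            stay = score[t - 1][k]
            move = score[t - 1][k - 1] if k > 0 else neg_inf
            best = max(stay, move)
            if best == neg_inf:
                continue  # turn k is not reachable at frame t
            advanced[t][k] = move > stay
            score[t][k] = best + log_likelihood(
                frames[t], models[speaker_sequence[k]])
    # Trace back from the last turn at the last frame.
    starts = [0] * K
    k = K - 1
    for t in range(T - 1, 0, -1):
        if advanced[t][k]:
            starts[k] = t
            k -= 1
    ends = starts[1:] + [T]
    return list(zip(speaker_sequence, starts, ends))


if __name__ == "__main__":
    # Toy one-dimensional "features" for two made-up speakers.
    models = {"A": ([0.0], [1.0]), "B": ([5.0], [1.0])}
    frames = [[0.1], [-0.2], [0.0], [5.1], [4.9], [0.2], [0.1]]
    print(align(frames, ["A", "B", "A"], models))
```

Each returned turn pairs a speaker from the transcript with a range of
audio frames, which is the kind of correspondence an index of speaker
turns needs. A real system would also have to handle silence, applause,
and members without a model, which this sketch ignores.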