LipTracker™ is
a non-invasive measurement tool for in-service lip sync analysis. It operates
using the same principle as the human brain by comparing the timing of
sounds in the audio and mouth shapes in the video to measure the lip sync
error. LipTracker™ improves productivity by supplementing
an operator's subjective and time-consuming analysis of lip sync with
rapid objective results measured in real time from the program material.
This easy-to-use measurement tool provides numeric and graphic displays
of the lip sync error, a history graph, status indicators and event logging.
LipTracker™ increases efficiency in systems design
and installation, daily operation and program quality assurance.
Audio offsets of up to ± 5 video frames can be measured in standard
mode or up to ± 20 frames in extended range mode. This unique approach
of analyzing real-time video and audio content does not require the insertion
of cues, codes or watermarks into the program stream. Because
the program material is untouched, LipTracker™
can be used at any point in the transmission path.
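The frame-based ranges above can be restated in milliseconds. The sketch below is illustrative only: the function name `frames_to_ms` and the two frame rates (25 fps and 30000/1001 fps) are assumptions chosen for the example, not values stated for LipTracker™.

```python
from fractions import Fraction

def frames_to_ms(frames, fps):
    """Convert a duration in video frames to milliseconds at the given frame rate."""
    return float(Fraction(frames) * 1000 / Fraction(fps))

# Assumed frame rates: 25 fps (PAL-based systems) and 30000/1001 fps (NTSC-based systems).
for label, fps in (("25 fps", Fraction(25)), ("29.97 fps", Fraction(30000, 1001))):
    standard = frames_to_ms(5, fps)    # standard mode: ± 5 frames
    extended = frames_to_ms(20, fps)   # extended range mode: ± 20 frames
    print(f"{label}: standard ±{standard:.1f} ms, extended ±{extended:.1f} ms")
```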
Depending on the program content, the first result can be displayed in
as little as 4 seconds after a face is detected. The result is then updated
every 2 seconds until the current face is lost or a new face is detected.
The history graph charts the most recent error profile and event logging
saves the results for scene by scene analysis. The Audio Offset Status
indicator is a visual warning of the current offset. User programmable
thresholds determine whether the indicator is Green, Yellow or Red at
any given offset reading.
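The threshold logic behind such an indicator can be sketched as follows. This is not LipTracker™'s implementation: the function name, the default thresholds and the symmetric treatment of early and late audio are all assumptions made for illustration.

```python
def offset_status(offset_frames, yellow_at=1.0, red_at=2.0):
    """Map an audio offset (in video frames) to an indicator color.

    yellow_at and red_at are hypothetical user thresholds. Only the magnitude
    of the offset is compared, so leading and lagging audio are treated alike.
    """
    if not 0 <= yellow_at <= red_at:
        raise ValueError("thresholds must satisfy 0 <= yellow_at <= red_at")
    magnitude = abs(offset_frames)
    if magnitude >= red_at:
        return "Red"
    if magnitude >= yellow_at:
        return "Yellow"
    return "Green"
```

With these assumed defaults, a reading of 0.5 frames would show Green, -1.5 frames Yellow and 3 frames Red.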
Features
Non-invasive analysis of lip sync errors up to ± 20 video frames by comparing video and audio Mutual Events (MuEvs)
Language independent
Displays current lip sync error with numeric and graphic displays
First result displayed in as little as 4 seconds and updated every 2 seconds
Measurement offset parameter compensates for known fixed delays in the video or audio being analyzed
Automatic face detection with point-and-click manual override for scenes with multiple faces
Audio Offset Status indicator provides a visual warning of the current offset
History display shows most recent error profile
Event logging for scene-by-scene analysis and archiving
Operates with SD or HD SDI video and either AES-3id audio or audio de-embedded internally from the SDI input
Digital and analog video and audio monitoring outputs
Face Detection
LipTracker™ searches frame by frame for a face
in the input video. After finding a face, LipTracker™
automatically locks onto it and maintains lock during typical camera pans,
tilts, zooms, and through the normal range of head motion. Minimum face
height (from the top of the head to the bottom of the chin) is one quarter
of the overall picture height.
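The minimum face size is a simple ratio check. A minimal sketch, assuming both heights are given in the same unit (for example, picture lines); the function name is hypothetical.

```python
def face_is_large_enough(face_height, picture_height):
    """Return True when the face spans at least one quarter of the picture height."""
    # Integer-friendly form of: face_height / picture_height >= 1/4
    return 4 * face_height >= picture_height
```

For a picture 1080 lines high, for instance, this check passes for faces of 270 lines or more.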
In scenes with multiple faces, if LipTracker™ selects
a non-speaking face for analysis, you can override the face selection
by pointing to the correct face with the cursor and double clicking the
mouse.
Determining The MuEv Offset
The sounds and mouth shapes that are used for MuEv analysis are commonly
found in the natural speech patterns of many languages. When a face is
detected, the input video is processed by locating the upper and lower
lips within the face and extracting the mouth shape characteristics to
generate a field by field stream of video MuEvs.
LipTracker™ does not need to be "trained"
in advance to recognize any particular voice. The input audio is normalized
and processed with LipTracker™'s proprietary technology
to generate a stream of audio MuEvs that are speaker independent.
The audio and video MuEv streams are then correlated to determine the
measurement of the lip sync error that is displayed on the screen. The
silence segments that occur in
the audio input are also identified to provide additional cues for measuring
the lip sync error.
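LipTracker™'s MuEv extraction and correlation are proprietary and are not described in detail here. The general idea of correlating two event streams to find their relative delay can nonetheless be sketched with a plain cross-correlation. Everything in this block is an assumption for illustration: the streams are modeled as equal-rate lists of 0/1 flags, and a positive result is defined to mean that audio events occur later than the matching video events.

```python
def estimate_lag(video_events, audio_events, max_lag):
    """Return the audio shift (in samples) that best aligns the two event streams.

    Each stream is a list of numbers sampled at the same rate (1 = event, 0 = none).
    A positive result means the audio events trail the video events.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0
        for i, v in enumerate(video_events):
            j = i + lag
            if 0 <= j < len(audio_events):
                score += v * audio_events[j]
        # Strict comparison: on a tie, the first (most negative) lag is kept.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical streams: the audio is the video pattern delayed by 3 samples.
video = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
audio = [0, 0, 0] + video[:-3]
print(estimate_lag(video, audio, max_lag=5))
```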
Event Logging
LipTracker™'s results can be archived for scene
by scene analysis. When logging is enabled, the audio offset measurements
are written to an HTML file and/or a comma delimited (.csv) file. The
.csv files can be imported into a spreadsheet or other application for
further analysis.
For each program segment that is analyzed, a thumbnail of the first frame
is stored along with the segment start time, the time of each measurement
and the audio offset at that
time. The system clock and/or VITC from the video input signal can be
selected as the logfile time reference.
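Once exported, a .csv log can be summarized with a few lines of script. The sketch below is hedged throughout: the column names (`segment_start`, `measurement_time`, `audio_offset_frames`) and the sample rows are hypothetical, since the actual log layout is not specified here.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical log content; a real LipTracker log may use different columns.
SAMPLE_LOG = """segment_start,measurement_time,audio_offset_frames
10:00:00,10:00:04,1.5
10:00:00,10:00:06,1.0
10:02:30,10:02:34,-0.5
"""

def mean_offset_per_segment(csv_text):
    """Group measurements by segment start time and average the audio offsets."""
    offsets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        offsets[row["segment_start"]].append(float(row["audio_offset_frames"]))
    return {start: mean(values) for start, values in offsets.items()}

print(mean_offset_per_segment(SAMPLE_LOG))
```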
Longitudinal time code (LTC) can also be recorded in the log files via
LipTracker™'s 9-pin serial port. An external converter
is required to translate baseband LTC to Sony™ serial protocol.
Language Independence
LipTracker™ analysis uses a number of key
sounds that have the same distinct mouth shapes in virtually all languages.
Examples include the EE sound (street in English, Paris in
French), the OO sound (moon in English, fruta in Spanish) and the AA sound
(palm in English, nacht in German). LipTracker™ is therefore
not limited to English speakers but is language independent.
Scene Change Processing
In the normal mode of operation, analysis automatically restarts
when a new face is detected. Each newly detected face is assumed to come
from a different source and therefore could have a different audio offset
than the face that preceded it.
However, for those applications where consecutive scenes are known to
have the same audio offset, the "continuous" mode of scene change
processing can be used. In this mode, the video and audio MuEv streams
from consecutive scenes are combined and averaged to produce the audio
offset results. This mode is also useful when the individual scenes are
too short to generate measurements.
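One way to picture the continuous mode is to pool correlation evidence across scenes before choosing an offset. The sketch below is an assumed model, not the product's algorithm: event streams are lists of 0/1 flags, per-lag scores are summed over all scenes, and ties are broken in favor of the smallest absolute lag.

```python
def lag_scores(video, audio, max_lag):
    """Score every candidate lag for one scene by counting coinciding events."""
    return {
        lag: sum(v * audio[i + lag]
                 for i, v in enumerate(video)
                 if 0 <= i + lag < len(audio))
        for lag in range(-max_lag, max_lag + 1)
    }

def pooled_lag(scenes, max_lag):
    """Sum the per-lag scores of all (video, audio) scenes and pick the best lag."""
    total = {lag: 0 for lag in range(-max_lag, max_lag + 1)}
    for video, audio in scenes:
        for lag, score in lag_scores(video, audio, max_lag).items():
            total[lag] += score
    # Highest pooled score wins; among equals, prefer the lag closest to zero.
    return max(total, key=lambda lag: (total[lag], -abs(lag)))

# Two hypothetical short scenes, each with audio delayed by 2 samples.
scenes = [
    ([0, 1, 0, 0, 0], [0, 0, 0, 1, 0]),
    ([1, 0, 0, 0], [0, 0, 1, 0]),
]
print(pooled_lag(scenes, max_lag=3))
```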
Measurement Offset
The LipTracker™ measurement window can be offset by up to ± 5
video frames in half-frame increments. This offset parameter is used
when there is a known fixed delay in either the video path or the audio
path feeding LipTracker™.
For example, an HD to SD downconverter will add delay to the video path,
or a digital audio processor can add delay to the audio path. Using the
appropriate value of measurement offset ensures that LipTracker™
operates in the center of its measurement range.
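The constraints on the offset parameter can be sketched as follows. The helper names are hypothetical, and the model of the window as the standard ± 5 frame range re-centered on the offset is an assumption drawn from the description above, not a documented behavior.

```python
STANDARD_RANGE_FRAMES = 5.0  # ± range in standard mode
MAX_OFFSET_FRAMES = 5.0      # limit of the measurement offset parameter

def snap_measurement_offset(frames):
    """Clamp to ± 5 frames, then round to the nearest half frame.

    Exact quarter-frame ties follow Python's round-half-to-even rule.
    """
    clamped = max(-MAX_OFFSET_FRAMES, min(MAX_OFFSET_FRAMES, frames))
    return round(clamped * 2) / 2

def measurement_window(offset_frames):
    """Return the assumed (low, high) span of measurable audio offsets, in frames."""
    center = snap_measurement_offset(offset_frames)
    return (center - STANDARD_RANGE_FRAMES, center + STANDARD_RANGE_FRAMES)

# Example: compensating for a known 2-frame delay in one path.
print(measurement_window(2))
```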
Measurement Response Time
LipTracker™ provides two modes of measurement
response time: Normal and Fast. Normal mode is appropriate for most applications
where the audio offset does not change significantly during a single speaker
segment. Fast mode is used when the audio offset may have significant
changes that occur frequently during single speaker segments.
Specifications
LipTracker Configuration
Each LipTracker™ 1RU frame is shipped with a breakout cable (see below), a keyboard and a mouse. Simply add a standard XGA monitor for a fully operational system.

| Digital I/O | |
| --- | --- |
| Video Input (SD mode) | Standard Definition SDI video (SMPTE 259M-C) |
| Input Formats (SD mode) | |
| Video Input (HD mode) | |
| Input Formats (HD mode) | |
| Input Connector | |
| Audio Input | |
| Input Format | |
| Input Connector | |
| Embedded Audio | |
| Video Monitoring Outputs | |
| Output Connector | |
| Audio Monitoring Output | |
| Output Connector | |

| Analog Monitoring Output | |
| --- | --- |
| Video Output | Selectable between: Composite NTSC or PAL; YC NTSC or PAL; Y, R-Y, B-Y (Betacam™ or SMPTE) |
| Output Connectors | 3 x BNC (75 Ω) |
| Audio Output | 1 balanced stereo pair |
| Output Connectors | 2 x XLR - breakout cable |

| LTC Input | |
| --- | --- |
| Input Format | |
| Connector | |

Longitudinal time code can be used as a logging reference. An
external converter must be used to convert the baseband time code
to Sony™ serial format (RS-422).

Rear Panel Interface (option)
The breakout cable connections can be replaced with the optional
1RU Rear Panel Interface.
LipTracker™ and Pixel Instruments are trademarks
of Pixel Instruments Corporation. Sony and Betacam are trademarks of Sony
Corporation.
Features and specifications subject to change without notice. U.S. Patent
Applications 20040227856, 20070153125, 20070153089 and other patents applied
for.

