Closed Caption Decoder
Theory of Operation
29-DEC-2003

Copyright 1995, 1996, 2003 Eric Smith and Richard Ottosen
http://www.brouhaha.com/~eric/pic/caption.html


What is a closed caption decoder?
---------------------------------

Closed captioning is the encoding of textual information in line 21 of the
vertical blanking interval of a video signal. The primary purpose is to make
the program accessible to the hearing-impaired. Closed captioning is now
available on many television broadcasts, video cassettes, and laser video
discs. The captions may be prerecorded with a scripted program or added live,
as with a news broadcast.

Use of the captioning requires a special decoder. Normal closed caption
decoders take this hidden information and display it as an overlay on the
video image (like subtitles). In the past this decoding was usually done by a
special add-on unit, but now it is often done by electronics built into the
television receiver.


How to use the PIC closed caption decoder
-----------------------------------------

This closed caption decoder accepts a baseband (i.e., composite) video input
(typically from the video output of a VCR), and outputs the decoded caption
information to a serial port. The serial output may then be fed to a serial
port on a PC and captured with a terminal emulation program. Since the VCR
records line 21, the decoder works with both off-the-air and recorded
programs.

An LED lights and the EIA-232 carrier detect signal is asserted when a valid
closed caption signal is detected.


Background
----------

               _     _     _     _     _         _____       _____       __
              / \   / \   / \   / \   / \       |     |     |     |     |
___     _____/   \_/   \_/   \_/   \_/   \______|     |_____|     |_____|
   \___/
  sync tip   |<------ run-in------------>|      |start|     |
                                                  bit

Figure 1: Line 21 waveform (not to scale)


Hardware
--------

Power supply
------------

U1, an LM2931-5 voltage regulator, takes the nominal +9 volts and regulates it
down to +5 volts. Use a 9 volt DC wall transformer for line powered operation.
This regulator is designed to withstand reversed input voltage. C1 must be
nonpolarized so that it can also handle a reversed polarity input voltage.
Alternatively, a 78L05 type regulator could be used by adding a 1N4001 diode
in series with the positive battery terminal (anode to battery) to protect the
regulator and circuit from a momentarily reversed battery.

The power requirement is less than 50 mA at +9 volts:

Ref         Number              Typical   Maximum
-------------------------------------------------
U1          LM78L05             2.5 mA    5.0 mA    $$$ replace w/ LM2931AZ-5.0
U2          PIC16F628A          3.1 mA    3.4 mA
U3          Intersil EL4581C    1.7 mA    3.0 mA
U4          TLC272              1.9 mA    4.0 mA
Q1          2N5089              0.3 mA    0.6 mA
Q2          2N3904                                  $$$ added
D4          LED                 3.2 mA    3.2 mA
R10         Pullup              2.5 mA    5.0 mA
R14, R15    EIA-232 loads       1.2 mA    2.4 mA
                                -------   -------
                                16.4 mA   26.6 mA   $$$ recompute


Microprocessor
--------------

The microprocessor is a Microchip PIC16F628A, chosen because its two internal
analog comparators are useful for implementing an adaptive data slicer.

The PIC runs at an oscillator frequency of 20 MHz. This is slightly less than
40 times the 503 kHz run-in frequency of the closed caption signal.


Sync Separator
--------------

The EL4581 is an improved version of the LM1881. Observations show that the
EL4581 performs much better than the LM1881 when less than ideal video signals
are being received.

Video is applied to the input of the EL4581 (U3), which strips off the
composite sync for the micro to use. The CSYNC pin, which is low during the
sync period, goes into PB0 (pin 6 of U2), where the falling edge generates an
interrupt. The composite sync and odd field outputs of the EL4581 are wired to
PB6 and PB7 of the PIC, and are polled by the interrupt handler.

The video is filtered to reduce the effects of noise impulses on the video
signal. This filtered video is applied to emitter follower Q1. Q1 obtains its
bias (1.3 volts to 1.9 volts) from the video clamp at the EL4581 video input
pin.
Q1 is used to prevent DC restore pulses from feeding glitches back into the
sync stripper.

Port output PA2 (U2 pin 1) is pulsed low during the video blanking periods.
This charges cap C11 to the difference between the AC coupled video blanking
potential and ground. C11 will hold this voltage for several video lines,
keeping the blanking level at ground independent of the variations in average
brightness of the video. R6 sources a small current to ensure that the PA2
clamp always pulls toward ground. After the DC restore pulse, the video is
left driving PA2 as an analog input.


Data Slicer
-----------

The data slicer takes the analog video signal and discriminates the caption
data, producing a digital signal. This is generally done with a comparator and
an adjustable threshold. If the threshold is set slightly high or low, the
duty cycle of the pulses from the comparator will vary, making it difficult to
reliably decode the data.

Because the input signal level can vary considerably, a simple fixed threshold
is not suitable. In fact, the signal level can vary between programs and
commercials, or even between scenes in a single program, so even a manually
adjustable threshold is unacceptable. The data slicer uses an adaptive slice
level which is determined by a peak detector, so the threshold automatically
adapts to changing signal levels.

The data slicer is implemented using the two internal analog comparators of
the PIC16F628A. The comparators are used in mode 6, so the outputs of both
comparators are available on pins of the PIC.

Comparator one is used as the peak detector. Port pin PA0 controls the peak
detector to catch peaks when the closed caption data is present in the video.
To catch the peaks of the closed caption, the peak detect capacitor is
discharged just before the start of the run-in cycles. Comparator one then
supplies the charging current for hold capacitor C12. Resistor R7 limits the
charging current into C12.
Diode D1 prevents discharge of the peak holding capacitor by the comparator
output between peaks. Cap C12 is chosen small enough in value to be charged by
the current from R7 and large enough in value to hold peaks while discharging
through resistors R8 and R9.

The negative peaks of the closed caption are transmitted at the same level as
video blanking. With DC restore this is 0 volts. Resistors R8 and R9 split the
difference between the positive peak held on C12 and ground, setting the
threshold for the data slicer (comparator two) midway between the peaks of the
data voltage.

The output of comparator two is available on pin PA4 of the PIC, an open-drain
output, with R10 used as a pullup. This output is TP7, a test point for the
sliced data. The comparator two output is also available to the micro in the
MSB of the comparator control register (address 1Fh), allowing the software to
read it quickly using a rotate instruction.

During video line 21, the software reads the raw data stream into an array and
then processes this array to find the character codes in the caption. There
are five samples of raw data for each run-in cycle and also five samples per
character data bit.


EIA-232 Interface
-----------------

The EIA-232 drivers are sections of a dual op-amp (U4) used as comparators.
Resistors R12 and R13 bias the inputs of the amplifiers to half of the logic
swing out of the PIC. The outputs of U4 swing between about -4 volts and
+5 volts. The drivers are inverting. R14 and R15 give some protection against
reversed transmit/receive EIA-232 signals as well as short circuits.

The software sets serial communications to 19200 baud, 8 bits, no parity, and
1 stop bit. Active closed captioning is indicated to the computer by driving
the EIA-232 DCD (Data Carrier Detect) line positive.

$$$ rewrite this paragraph
An EIA-232 input is used to control the operation of the decoder. The serial
input comes in through a series resistor (R16) used to protect the input of
the micro.
A pull down resistor (R17) prevents the input from floating when the EIA-232
cable is not connected.


Charge Pump
-----------

The -4 volts is created by a charge pump driven by output pin PB3, using the
PWM mode of timer 2. When PB3 is high, capacitor C13 is charged through D3 to
+5 volts on the PB3 side and +0.6 volts on the other side. PB3 going low
forces the left (PB3) side of C13 to ground and therefore its right side to
-4.4 volts. This forward biases diode D2 and delivers -3.8 volts to charge
C14. To maintain a constant DC voltage on C14, PB3 must constantly be
switching between the high and low states.

The negative 4 volt power supply must supply about 6 mA maximum. C14 must be
large enough in capacitance to hold the ripple to a couple tenths of a volt.
This requires PB3 to change at least every $$$ 7ms.

Some notes about the charge pump: Diodes D2 and D3 are specified as 1N4448s
to squeeze a few tenths of a volt out of the losses in the charge pump. In
most cases, more common 1N4148 or 1N914 type diodes will work fine. If you
want even better negative drive voltage, you can change D2 and D3 to 1N5817
Schottky rectifier diodes to get about -4.6 volts out of the charge pump for
the EIA-232 driver.


LED and PZT
-----------

LED D4 (on PB5) is lit to indicate the presence of closed caption on the video
signal. The LED is on when the port pin is driven high. The LED current is a
maximum of 3 mA. This is sufficient for most lighting conditions with a high
brightness LED. The function of the LED is completely under software control.

A PZT speaker can also be placed on PB4 to beep at power on time. Note that
when using an ICD2 for debugging, PB4 must be low, so the PZT function should
be disabled by changing the definition of the has_pzt conditional near the top
of the source file.


Software
--------

The main loop of the program polls a flag to determine whether a line 21
sample set is available, and if so, demodulates it and outputs the result.
The main loop also attempts to receive characters from the serial receive ring
buffer and store them into the command buffer. When a carriage return
character is received, the command processor is called.

The composite sync interrupt is used to control the timing and data sampling
operations of the closed caption decoder. The composite sync interrupt handler
handles DC restore and counts scan lines. On lines other than 21, it polls the
UART receive and transmit. The UART is not configured to generate interrupts
because that would introduce non-deterministic latency in the composite sync
interrupt handler.

On line 21, the composite sync interrupt handler clamps the peak detector,
delays until the approximate time of the start of run-in, and releases the
peak detector clamping. It then collects 136 data samples into a 17-byte
buffer called "sample". The sampling code is written as a series of inlined
pairs of instructions like this:

        rlf     datap,w         ; get first bit of sample+0
        rlf     sample+0        ; rotate carry into the buffer byte

datap is defined by an equate to be the comparator control register. The MSB
of this register is the output of the data slicing comparator. The first
instruction of the pair copies this bit to the carry flag of the status
register. The second instruction rotates the carry into the first byte of the
sample buffer. This pair of instructions is repeated eight times to acquire
the first eight samples. There are 17 consecutive groups of eight pairs, and
each group is identical except that the offset ("+0") is incremented for each
successive group.

An alternative means of sampling could use alternating bit test and bit set
instructions rather than rotates:

        btfsc   porta,data
        bsf     sample+0,7

Because it does not rotate the input register, this allows the use of an
arbitrary bit rather than requiring the bit at the end of the register, and it
allows other bits of the same port to be used as outputs. The only
disadvantage is that it would require the sample buffer to be cleared in
advance.
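
As an illustration only, the rotate-pair capture described above can be
modeled on a host computer in Python. This is a sketch, not part of the PIC
firmware: the function names "rlf" and "capture" and the argument
"slicer_bits" are hypothetical, and the model assumes what the text describes,
namely that each instruction pair moves one slicer bit through the carry flag
into the current buffer byte, so the first sample of each group of eight ends
up in the most significant bit.

```python
SAMPLE_BYTES = 17            # size of the "sample" buffer described in the text
SAMPLES = SAMPLE_BYTES * 8   # 136 one-bit samples per line 21


def rlf(value, carry):
    """Model the PIC 'rlf' instruction: rotate 8 bits left through carry.

    Returns the new 8-bit value and the new carry flag.
    """
    shifted = (value << 1) | carry
    return shifted & 0xFF, (shifted >> 8) & 1


def capture(slicer_bits):
    """Pack 136 slicer samples into 17 bytes, eight rotate pairs per byte."""
    if len(slicer_bits) != SAMPLES:
        raise ValueError("expected exactly %d samples" % SAMPLES)
    # Deliberately start from an uncleared buffer: eight rotates per byte
    # overwrite every bit, so the old contents never survive.
    sample = [0xFF] * SAMPLE_BYTES
    for i, bit in enumerate(slicer_bits):
        carry = bit & 1                                  # first instruction of the pair
        sample[i // 8], _ = rlf(sample[i // 8], carry)   # second instruction of the pair
    return sample
```

For example, capture([1, 0, 1] + [0] * 133) returns a list whose first byte is
0xA0. The model starts from a buffer filled with 0xFF and still produces the
right bytes, which is the property that lets the rotate method skip the buffer
clearing that the bit-set alternative would need.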

The processor clock frequency is approximately 40 times the data rate. The PIC
has an internal divide by four, so each data bit is approximately ten CPU
cycles wide. Each pair of instructions for the sampling takes two CPU cycles,
so there are approximately five samples per data bit.

In principle it is only necessary to take one sample per data bit. However,
that sample must be taken near the center of the data bit (or at least not
near the edge). It is not feasible with the PIC to write code to determine the
right time to sample on the fly, so the oversampling by a factor of five
allows the correct sample times to be determined after line 21 has been read
in its entirety.

Note that the run-in frequency is the same as the data rate, so each cycle of
run-in consists of approximately 2.5 samples high and 2.5 samples low. The
run-in is intended to make it easy to get a hardware PLL to lock to the data
rate and provide accurate sample times in the middle of the data bit times.
The run-in signal is a sine wave, while the actual data bits are square waves,
but this makes no difference to the data slicer. In this design the leading
edge of the start bit is used as the reference time rather than the edges of
the run-in cycles; however, since the decoding is performed in software, this
could easily be changed.

$$$ demod

After all of the raw samples are captured, they are decoded by a routine
called "process". A low level function called "getsbit" is called repeatedly
to retrieve successive samples from the sample buffer; the sample bits are
returned in the carry flag and an end-of-buffer indication is returned in the
zero flag.

First the run-in and the leading edge of the start bit are located. If the
number of transitions detected between the beginning of the buffer and the
start bit is outside the range tmin..tmax, it is assumed that there is no
valid caption data present.
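
The run-in check described above can be sketched in Python as a host-side
model. This is an illustration under stated assumptions, not the firmware's
actual code: "sample_bits" only imitates the described behavior of "getsbit",
the limits TMIN and TMAX are placeholder values rather than the ones in the
source file, and the rule used here to recognize the start bit (the first
rising edge after a low run of at least GAP samples, longer than any run-in
half cycle) is an assumption made for the sketch.

```python
TMIN, TMAX = 8, 16   # placeholder limits; the real tmin..tmax are in the source file
GAP = 8              # assumed: only the pause before the start bit stays low this long


def sample_bits(sample):
    """Yield the buffered samples one at a time, oldest (MSB) first.

    Exhausting the generator plays the role of the end-of-buffer
    indication that "getsbit" returns in the zero flag.
    """
    for byte in sample:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def find_start_bit(sample):
    """Return the sample index of the start bit's leading edge, or None."""
    transitions = 0   # edges seen so far (each run-in cycle contributes two)
    low_run = 0       # length of the current run of low samples
    previous = 0
    for index, bit in enumerate(sample_bits(sample)):
        rising = bit == 1 and previous == 0
        if rising and transitions > 0 and low_run >= GAP:
            # First rising edge after the long pause: treat it as the start bit,
            # but only accept it if the run-in edge count looks plausible.
            return index if TMIN <= transitions <= TMAX else None
        if bit != previous:
            transitions += 1
        low_run = 0 if bit else low_run + 1
        previous = bit
    return None       # ran off the end of the buffer without finding a start bit
```

A buffer holding seven run-in cycles followed by a long low pause and a high
start bit yields the index of that rising edge; a buffer with too few run-in
edges, or with no edges at all, yields None, which corresponds to the "no
valid caption data" case.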

Since the sampling occurred at approximately five times the data bit rate, the
code gets groups of five bits and looks them up in a voting table. The default
table defines the result to be equal to the majority of the middle three
samples, with the outermost two samples ignored. An alternate voting table,
which looks at only the middle sample, may be substituted at assembly time.

The routine "parity" is called to check the parity of each decoded data byte.

Currently the code doesn't do any sophisticated handling of the control codes
which are used for language selection, color, screen positioning, etc. These
would be fairly easy to add.

The decoded data is emitted to the serial port by calling "xmit", which puts
the characters into a ring buffer.

Because this unit is sometimes used with two-line LCD displays, it is
desirable to prevent text from scrolling off too quickly. This is done through
the use of "lazy carriage returns". This means that when the CR code is
detected, it doesn't get immediately transmitted out the serial port. Instead
the "lazycr" flag gets set, which will cause a CR to be sent before the next
character.


Commands
--------

Received characters are transferred to the command buffer until a carriage
return is found, whereupon the command scanner is called to identify the
command and dispatch to the appropriate code. The commands are:

    D   set debug mode
    N   set normal mode
    R   set raw mode
    F   enable frame count
    C   load initial value into frame counter
    S   stop
    G   go

The command scanner is case sensitive, so the command must be sent in upper
case. All commands must be terminated by a carriage return.

The "D", "N", and "R" commands select one of three mutually exclusive
operating modes, with "N" (normal) mode being the default. In Normal mode, the
decoder processes the received data, and converts unrecognized control codes
into CR/LF sequences.
In Raw mode, the decoder does no processing of the data bytes received on
line 21, but instead sends them directly to the serial port. Debug mode is
used to debug the internal algorithms of the decoder. No detailed description
is available.

The "F" command enables the output of the frame counter at the start of each
caption. The frame count is output as a six digit decimal value followed by a
CR/LF sequence. The frame counter counts complete frames, which occur in NTSC
(RS-170A) video at a nominal rate of 29.97 frames per second. The decoder does
not internally implement any drop frame time code to adjust the rate to 30.00.

The "C" command can be used to set the initial value of the frame counter.
After the "C", an initial frame count of one to six digits should be sent,
followed by a carriage return. Note that this is the only case in which a
carriage return should be sent to the decoder.


References
----------

"Build the TextGrabber", Electronics Now, November 1994, pg 31.
    Project circuit takes baseband video in and outputs EIA-232 caption.
    Some information on how caption works, interesting implementation.

"Exploring the Vertical Blanking Interval", Circuit Cellar INK, April 1994,
pg 24.
    Discusses several different text and graphic standards. Shows how to
    decode several of the text formats. Good information to get started
    understanding closed captioning.

"Closed Captioning with the Motorola 68HC05CC1", Circuit Cellar INK,
May 1993, pg 12.
    Uses "custom" microprocessor for placing captions on television screen.
    Hard to experiment with special processor. Good references in article.

$$$ need more here

"Line 21 Data Services for NTSC", Electronic Industries Association,
EIA-608, 1992.
    $$$ rewrite description

$$$ PBS document... ???