Extracting Captions
Many videos have closed captions baked into the data itself. This recipe shows how you can use the TV Kitchen to turn these hidden captions into a stream of text.
Components
Packages
Package | Description |
---|---|
@tvkitchen/countertop | The entry point for developers who want to set up a TV Kitchen. |
@tvkitchen/appliance-video-file-receiver | Converts a video file into MPEG-TS Payloads. |
@tvkitchen/appliance-video-caption-extractor | Extracts captions from MPEG-TS Payloads using CCExtractor. |
Output Streams
Type | Description |
---|---|
STREAM.CONTAINER | Chunks of MPEG-TS data. |
TEXT.ATOM | A stream of characters. |
The Recipe
Estimated Cook Time: 5 minutes
Preparation
You will need:
Instructions
Step 1: Set up a new project
mkdir my-recipe
cd my-recipe
yarn init
Step 2: Install TV Kitchen components
yarn add @tvkitchen/countertop
yarn add @tvkitchen/appliance-video-caption-extractor
yarn add @tvkitchen/appliance-video-file-receiver
Step 3: Write some code
Open a file called index.js
and import the TV Kitchen packages.
const { Countertop } = require('@tvkitchen/countertop');
const { VideoCaptionExtractorAppliance } = require('@tvkitchen/appliance-video-caption-extractor');
const { VideoFileReceiverAppliance } = require('@tvkitchen/appliance-video-file-receiver');
From here, create your Countertop.
const countertop = new Countertop();
We have to give the countertop some video to process. This is done by adding a video receiver appliance. This recipe will use the VideoFileReceiverAppliance, but you can actually use any appliance that produces STREAM.CONTAINER
data.
For instance, you could use the VideoHttpReceiverAppliance.
The VideoFileReceiverAppliance
has one required parameter: filePath
. If you don't have a sample video file that contains closed captions, you can use this one. Just put it in your recipe directory in a file called sample.ts
.
countertop.addAppliance(
VideoFileReceiverAppliance,
{
filePath: './sample.ts'
}
);
Next, set up the VideoCaptionExtractorAppliance, which will watch for video data and turn it into captions.
countertop.addAppliance(VideoCaptionExtractorAppliance);
Everything is set up, but we should do something with the resulting captions. For this recipe we'll just output them to the console, but there are plenty of more interesting possibilities.
countertop.on('data', (payload) => {
if (payload.type === 'TEXT.ATOM') {
process.stdout.write(payload.data);
}
});
Add some code to start the countertop.
countertop.start();
Finally, run your script!
yarn node index.js
You should see a stream of captions within a few seconds.
The Result
index.js
const { Countertop } = require('@tvkitchen/countertop');
const { VideoCaptionExtractorAppliance } = require('@tvkitchen/appliance-video-caption-extractor');
const { VideoFileReceiverAppliance } = require('@tvkitchen/appliance-video-file-receiver');
const countertop = new Countertop();
countertop.addAppliance(
VideoFileReceiverAppliance,
{
filePath: './sample.ts'
}
);
countertop.addAppliance(VideoCaptionExtractorAppliance);
countertop.on('data', (payload) => {
if (payload.type === 'TEXT.ATOM') {
process.stdout.write(payload.data);
}
});
countertop.start();
Troubleshooting
Kafka
Remember that for any recipe to work you need a running instance of Kafka. By default, TV Kitchen assumes Kafka will be available locally (127.0.0.1
) on port 9092
. If you want to use an existing Kafka server somewhere else, you can pass kafkaSettings
to the countertop to pass a different configuration value. TV Kitchen is using KafkaJS.
Something Else?
If you're stuck, check out the help page!
A Precooked Version
Don't want to implement this recipe yourself?
git clone https://github.com/tvkitchen/cookbook.git
and run:
yarn kafka
yarn start extracting-captions
The code can be found here.