Extracting Captions

Many videos have closed captions baked into the data itself. This recipe shows how you can use the TV Kitchen to turn these hidden captions into a stream of text.

Components

Packages

Package                                      Description
@tvkitchen/countertop                        The entry point for developers who want to set up a TV Kitchen.
@tvkitchen/appliance-video-file-ingestion    Converts a video file into MPEG-TS Payloads.
@tvkitchen/appliance-ccextractor             Extracts captions from MPEG-TS Payloads using CCExtractor.

Output Streams

Type                Description
STREAM.CONTAINER    Chunks of MPEG-TS data.
TEXT.ATOM           A stream of characters.

The Recipe

Estimated Cook Time: 5 minutes

Preparation

You will need Node.js and Yarn for the commands below, plus a running Kafka instance (see Troubleshooting).

Instructions

Step 1: Set up a new project

mkdir my-recipe
cd my-recipe
yarn init

Step 2: Install TV Kitchen components

yarn add @tvkitchen/countertop
yarn add @tvkitchen/appliance-ccextractor
yarn add @tvkitchen/appliance-video-file-ingestion

Step 3: Write some code

Create a file called index.js and import the TV Kitchen packages.

const { Countertop } = require('@tvkitchen/countertop');
const { CCExtractorAppliance } = require('@tvkitchen/appliance-ccextractor');
const { VideoFileIngestionAppliance } = require('@tvkitchen/appliance-video-file-ingestion');

From here, create your Countertop.

const countertop = new Countertop();

We have to give the countertop some video to process. This is done by adding a video ingestion appliance. This recipe will use the VideoFileIngestionAppliance, but you can actually use any appliance that produces STREAM.CONTAINER data.

For instance, you could use the VideoHttpIngestionAppliance.

The VideoFileIngestionAppliance has one required parameter: filePath. If you don't have a sample video file that contains closed captions, you can use this one. Save it in your recipe directory as sample.ts.

countertop.addAppliance(
  VideoFileIngestionAppliance,
  {
    filePath: './sample.ts'
  }
);
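A relative filePath only works if the script runs from the directory that contains the video. The sketch below fails early with a clear message when the file is missing. Note that resolveVideoPath is a hypothetical helper written for this recipe, not part of TV Kitchen, and it assumes sample.ts sits next to index.js.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of TV Kitchen): resolve a video file name
// against a base directory and fail early if it is not a regular file.
function resolveVideoPath(fileName, baseDir = __dirname) {
  const fullPath = path.resolve(baseDir, fileName);
  if (!fs.existsSync(fullPath) || !fs.statSync(fullPath).isFile()) {
    throw new Error(`Video file not found: ${fullPath}`);
  }
  return fullPath;
}

// Usage in the recipe (assumes sample.ts sits next to index.js):
// countertop.addAppliance(VideoFileIngestionAppliance, {
//   filePath: resolveVideoPath('sample.ts'),
// });
```

Resolving against the script's directory rather than the current working directory means the recipe behaves the same no matter where it is launched from.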

Next, set up the CCExtractorAppliance, which will watch for video data and turn it into captions.

countertop.addAppliance(CCExtractorAppliance);

Everything is set up, but we should do something with the resulting captions. For this recipe we'll just output them to the console, but there are plenty of more interesting possibilities.

countertop.on('data', (payload) => {
  if (payload.type === 'TEXT.ATOM') {
    process.stdout.write(payload.data);
  }
});
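One possibility beyond printing raw characters is to group them into whole lines before doing anything else with them. The following is a minimal sketch, not part of TV Kitchen: createLineCollector is a hypothetical helper, and it assumes that caption lines are separated by newline characters in the TEXT.ATOM data, which may not hold for every source.

```javascript
// Hypothetical helper (not part of TV Kitchen): buffers caption characters
// and calls onLine once for every complete line.
function createLineCollector(onLine) {
  let pending = '';
  return {
    // Feed the data from one TEXT.ATOM payload (string or Buffer).
    push(text) {
      pending += String(text);
      const parts = pending.split('\n');
      pending = parts.pop(); // keep the unfinished tail for the next push
      parts.forEach((line) => onLine(line));
    },
    // Emit whatever is left, for example when the video ends.
    flush() {
      if (pending.length > 0) {
        onLine(pending);
        pending = '';
      }
    },
  };
}

// Usage in the recipe:
// const collector = createLineCollector((line) => console.log(line));
// countertop.on('data', (payload) => {
//   if (payload.type === 'TEXT.ATOM') {
//     collector.push(payload.data);
//   }
// });
```

Because the collector keeps only the unfinished tail in memory, it can run for as long as the stream does.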

Add some code to start the countertop.

countertop.start();

Finally, run your script!

yarn node index.js

You should see a stream of captions within a few seconds.

The Result

index.js

const { Countertop } = require('@tvkitchen/countertop');
const { CCExtractorAppliance } = require('@tvkitchen/appliance-ccextractor');
const { VideoFileIngestionAppliance } = require('@tvkitchen/appliance-video-file-ingestion');

const countertop = new Countertop();

countertop.addAppliance(
  VideoFileIngestionAppliance,
  {
    filePath: './sample.ts'
  }
);

countertop.addAppliance(CCExtractorAppliance);

countertop.on('data', (payload) => {
  if (payload.type === 'TEXT.ATOM') {
    process.stdout.write(payload.data);
  }
});

countertop.start();

Troubleshooting

Kafka

Every recipe needs a running instance of Kafka. By default, TV Kitchen assumes Kafka is available locally (127.0.0.1) on port 9092. To use an existing Kafka server somewhere else, pass kafkaSettings to the countertop with a different configuration. TV Kitchen uses KafkaJS to talk to Kafka.
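As a rough sketch of pointing the countertop at another Kafka server, the helper below builds a settings object with the brokers list that KafkaJS expects. Several things here are assumptions rather than documented behavior: buildKafkaSettings is a hypothetical helper, KAFKA_BROKERS is a made-up environment variable name, and passing kafkaSettings through the Countertop constructor options is an assumed shape, so check the countertop documentation before relying on it.

```javascript
// Hypothetical helper (not part of TV Kitchen): builds a KafkaJS-style
// settings object from the environment. KAFKA_BROKERS is an assumed
// variable name, e.g. "kafka-1:9092,kafka-2:9092".
function buildKafkaSettings(env = process.env) {
  const brokers = (env.KAFKA_BROKERS || '127.0.0.1:9092')
    .split(',')
    .map((broker) => broker.trim())
    .filter((broker) => broker.length > 0);
  return { brokers };
}

// Assumed usage: settings passed through the Countertop constructor options.
// const countertop = new Countertop({ kafkaSettings: buildKafkaSettings() });
```

When KAFKA_BROKERS is unset, the helper falls back to the local default described above.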

Something Else?

If you're stuck, check out the help page!

A Precooked Version

Don't want to implement this recipe yourself?

Clone the cookbook:

git clone https://github.com/tvkitchen/cookbook.git

Then run:

yarn kafka
yarn start extracting-captions

The code can be found here.

