View Full Version : Possible Solution to Auto-Cutting Commercials.

06-18-2003, 10:40 PM
Ok, let me start by saying that I have been mulling this one over for 13 months now. I wanted to try and create a program to do this myself, but my skills are limited, and while I have been learning a lot of programming, I just can’t wait any longer and all the tools seem to be sitting out there. My job is non-computer-related, and I typically work about 13-14 hours a day, which does not leave much time for programming. My guess is that JDiner, Cwingert, Riley and the Tivo PHP guru (Enigma, I think) could have this working in a matter of days. I have always had an aversion to asking someone to write a program for me, but I think this idea really contributes, and I encourage everyone to ignore me if you disagree. Other solutions for commercial cutting (like looking for black frames) seem to be iffy, and I think this solution is at least a little less iffy (at least for Prime Time shows). So here goes.

Thumbnail sketch: My idea is a central database online that, for any given show, keeps track of which frame each commercial break starts on and which frame it ends on. Think of it as a “CDDB” for commercials. We are all cutting commercials out to make Divx files or to go to DVD, so why duplicate each other’s efforts? If we can get Josh or Riley or Cwingert (or any one of them) to integrate it into their proggies like CDDB is in Winamp, for example, all of our workflows will get shorter. Commercial cutting is the most time-consuming aspect of my flow right now.

1. Necessary Data to upload to the database: I think we’d need to upload the name of the show, the name of the episode, the original air date, the record date, the station callsign where available (more on this below), the commercial placement info (duh), NTSC or PAL (or something like US vs. UK, etc.) and a few more tidbits that I will mention below. We will need file name conventions so that requests for commercial data can be automatic. (Something along the lines of Riley’s naming method for mfs-ftp files – although I agree with others that having the FSID at the end is better for sorting). Also, I think you’d need a Tivo/Turbonet to take full advantage of my system.
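To make the list above concrete, here is a rough sketch of what one uploaded record might look like. It assumes Python, and every field name and the key format are made up for illustration; none of it comes from an existing tool or from Riley's naming method.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BreakRecord:
    """One uploaded set of commercial breaks for one recording (hypothetical layout)."""
    show: str
    episode: str
    original_air_date: str   # ISO date string, e.g. "2003-06-18"
    record_date: str         # ISO date string
    callsign: str            # empty string when not available
    video_standard: str      # "NTSC" or "PAL"
    # Each break is (first_frame_of_break, last_frame_of_break).
    breaks: List[Tuple[int, int]] = field(default_factory=list)

    def lookup_key(self) -> str:
        """Key a client could derive from a file name to request data automatically."""
        return "|".join([self.show.lower(), self.episode.lower(), self.original_air_date])


# Made-up values, purely for illustration.
rec = BreakRecord("Example Show", "Pilot", "2003-06-18", "2003-06-18",
                  "WXYZ", "NTSC", [(18000, 23400)])
print(rec.lookup_key())
```

Keying on show, episode and original air date is only one possible choice; the point is that the client can build the key without any typing from the user.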

2a. Biggest Hurdle: Timing issues. How do I make sure the commercial starts at the same frame on your Tivo as it does on mine? Undoubtedly right now it won’t. But Tivo has already given us a solution. We synchronize our Tivos! (“Tivos of the world unite!”) Have you ever watched your Tivo dial in? What’s the first thing it does? Sets the clock. It does this by running a program called “ntpdate,” which sets the clock based on an ntp server at Tivo. Tivo uses the ‘slew’ (I think that’s what it’s called) feature, which slowly realigns the clock after calling in, but for my purposes we’d have to have the update be instantaneous (which is possible - I have tested manual ntpdates extensively). The beauty is that there are ntp servers all over the Internet, and Tivo’s ntp server is set to the exact same time as all of them. Now, my Tivo’s clock sucks and we need the clocks to be accurate for this to work (remember: a frame is only 1/30th of a second, or 1/25th for PAL). It only takes my Tivo about 5 minutes after setting the clock to be off by 1/30th of a second. So we would have to update the clock either (i) once every 5 minutes – which is really no big deal if it is automated, or (ii) at :28 and :58 of each hour (assuming you only Tivo “standard” shows that start then). Imagine when “Alias” comes on, all of our little red lights on all of our synchronized Tivos coming on simultaneously across the country – brings a tear to your eye, doesn’t it?
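The drift numbers above work out like this. A small sketch, assuming Python, using the 1/30 and 1/25 second frame lengths from the post (function names are made up):

```python
def frame_duration(fps: float) -> float:
    """Length of one frame in seconds."""
    return 1.0 / fps


def drift_ppm(error_seconds: float, elapsed_seconds: float) -> float:
    """Clock drift in parts per million, given an observed error over an interval."""
    return error_seconds / elapsed_seconds * 1e6


def max_resync_interval(fps: float, ppm: float) -> float:
    """Seconds until a clock drifting at `ppm` is off by one full frame."""
    return frame_duration(fps) / (ppm * 1e-6)


# One NTSC frame (1/30 s) of error after 5 minutes (300 s).
ppm = drift_ppm(frame_duration(30), 300)
print(round(ppm, 1))                           # about 111.1 ppm
print(round(max_resync_interval(25, ppm)))     # PAL at the same drift: 360 s
```

So a clock like mine drifts by roughly 111 parts per million, and a PAL box with the same drift would get about 6 minutes before being a frame off. If the setup should never be a whole frame out, the resync would have to happen more often than that.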

2b. Padding: I think all the data we need to account for padding is now available. If you look at Riley’s xml data, everything is there. My important shows are padded to start a minute early, and when I look at the xml data, it says the show started a minute early. The clients will upload whether a show is padded and by how much when they upload the other data. Then either our server or our clients can just do the math to take padding into account. Padding after a show should be irrelevant, since ending late doesn’t move any of the frames before it.
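The padding math could be as simple as the sketch below. It assumes Python, whole-second start padding, and 30 frames per second as in the post; the function names are hypothetical.

```python
def to_show_frame(recorded_frame: int, start_pad_seconds: int, fps: int = 30) -> int:
    """Convert a frame number in a padded recording to a frame number
    counted from the scheduled start of the show."""
    return recorded_frame - start_pad_seconds * fps


def to_recording_frame(show_frame: int, start_pad_seconds: int, fps: int = 30) -> int:
    """Convert a frame counted from the scheduled start back into
    a frame number in a recording with the given start padding."""
    return show_frame + start_pad_seconds * fps


# Uploader padded by 60 s and saw a break start at frame 19800 of the recording.
shared = to_show_frame(19800, 60)          # 18000, stored in the database
# Downloader did not pad at all, so the break starts at frame 18000 for them.
print(to_recording_frame(shared, 0))
```

The database would store only the padding-free number, so each client applies its own padding on the way in and on the way out.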

2c. Time Zones: Obviously, time zones create a little wrinkle. The fix is to either submit your time zone with your request for data so the server can send you data relevant to your location or use some sort of lookup table using the call sign of the recorded show to try and find a database entry with the same callsign (someone in your viewing area) or one geographically located nearby.
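A sketch of the lookup order described above: same callsign first, then anything from the same time zone. It assumes Python and that entries are plain dicts with made-up "callsign" and "timezone" keys; the "geographically nearby" fallback is left out because it would need station location data.

```python
from typing import Dict, List, Optional


def pick_entry(entries: List[Dict[str, str]], callsign: str,
               timezone: str) -> Optional[Dict[str, str]]:
    """Return the best database entry for this viewer, or None."""
    # First choice: someone in the same viewing area (same station callsign).
    if callsign:
        for entry in entries:
            if entry["callsign"] == callsign:
                return entry
    # Fallback: any entry recorded in the same time zone.
    for entry in entries:
        if entry["timezone"] == timezone:
            return entry
    return None


entries = [
    {"callsign": "AAAA", "timezone": "US/Eastern"},   # made-up stations
    {"callsign": "BBBB", "timezone": "US/Pacific"},
]
print(pick_entry(entries, "CCCC", "US/Pacific"))
```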


06-18-2003, 10:41 PM
3. Participation: Of course any method that relies on people sharing (think of Gnutella, for example) has potential free-rider problems. The database will only be as good as the data people upload into it. If everyone just wants to download and not upload then we’ll have problems. I can think of a few solutions. (i) Have a quota system. You have to upload commercial break data for one show for every 5 you download. Not ideal, but it might be the best starting solution. (ii) Make it VERY easy to upload. If a little program could parse my VirtualDub.jobs sylia file or my tytool .cut file and get the frame data out and upload it all in one click, I would definitely do it. Even better, though, would be to implement the uploading directly from your Tivo via a program running on the Tivo. Here is how it could work: When you are watching a show, at the commercial break you pause and frame forward or back a couple frames to find the exact end of the segment and then press Clear-1-Clear (or something like that – Zirak is the master of non-invasive remote presses – and cwingert is working on getting Tivo to ignore the presses you want it to). A program registers that that was frame x. After the break you pause, press Clear-2-Clear and move on. Tivo then uploads the break data to the server. (I am not sure if we have a way to get Tivo to output what frame it is currently on – I know we do in PC programs though). If we make it as easy as a couple extra button presses while we are watching, people won’t mind doing it (and besides, then you can train your wife/girlfriend/kids to do it and shove the dirty work off on someone else).
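The quota idea boils down to one comparison. A minimal sketch, assuming Python; the free starting allowance for brand-new users is my own assumption, since otherwise nobody could download anything before their first upload.

```python
def may_download(uploads: int, downloads: int, ratio: int = 5) -> bool:
    """True if the user is still within quota.

    Each upload earns `ratio` downloads. New users get one block of
    `ratio` downloads for free (an assumption, not part of the proposal).
    """
    return downloads < (uploads + 1) * ratio


print(may_download(uploads=0, downloads=4))   # still inside the free allowance
print(may_download(uploads=0, downloads=5))   # must upload something first
```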

4. Reliability: I also think there is a potential for garbage in the database. We would probably want to get data from at least 3 sources per show and do some sort of averaging to make sure the data is correct before letting people download it. Ideally, the server will keep its clock 100% accurate with ntp updates, will compare its clock with each uploading Tivo’s and will mark data as suspect if it comes from a Tivo whose clock is off. If you are uploading from a computer, perhaps your computer could check in with your Tivo first to make sure their clocks were synced before uploading, so using a computer as a “middle man” wouldn’t throw off the system. Also, on the PC client (and maybe on the Tivo?), we could add a “Rate this data” button as a way to tell the server that a set of cuts is bad. Uploaders could then be ranked on the historical reliability of their uploaded data and the server could dole out the submissions of the historically most reliable uploaders first (everything would remain anonymous other than to the server, though). Of course, you could have people who intentionally try and poison the database, too, and this last suggestion would help weed them out. Another alternative is to have a login or registration, etc. to prevent poisoning.
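Here is one way the "at least 3 sources plus some sort of averaging" check might look. It assumes Python, and it uses the median rather than a plain average, because a median is not dragged off by one garbage upload; that substitution is my choice, not something settled above.

```python
from statistics import median
from typing import List, Optional, Tuple

Cut = Tuple[int, int]  # (start_frame, end_frame) of one commercial break


def consensus_cuts(submissions: List[List[Cut]],
                   min_sources: int = 3) -> Optional[List[Cut]]:
    """Combine several uploaded cut lists into one, or return None
    if there is not enough agreement to publish anything."""
    if len(submissions) < min_sources:
        return None
    # Everyone must at least agree on how many breaks there are.
    if len({len(s) for s in submissions}) != 1:
        return None
    result = []
    for per_break in zip(*submissions):
        start = int(median(c[0] for c in per_break))
        end = int(median(c[1] for c in per_break))
        result.append((start, end))
    return result


uploads = [
    [(100, 200), (500, 600)],
    [(101, 199), (502, 600)],
    [(9999, 200), (500, 601)],   # one bad start frame
]
print(consensus_cuts(uploads))   # [(101, 200), (500, 600)]
```

The bad 9999 value simply loses the vote. A fuller version would also remember which uploader it came from, to feed the reliability ranking.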

5. Uses: This seems obvious, but I’m on a roll, so what the hell. 1. Cutting commercials out of DVDs and Divx files. This is the most time consuming piece of any process because you can’t automate it. You need your eyeballs. This method just lets you borrow someone else’s. With this solution you might still have to watch things for a while (Unless the db gets huge and very accurate), but at least you could load the cuts in from the start and maybe only move them a couple frames “left or right” for each cut. I think this would substantially speed things up. Also for some people (like me) who are happy with “close enough,” this could completely automate the process (it seems to me the “search for black frames” method will also only be “close enough” and could lead to mistakes if there are black frames in the middle of a segment (like on “24”)). Also, (and I do not know if this is possible, given our current knowledge) assuming we could tell the Tivo to skip to a certain time or frame in a file during playback, we could have Tivo request the cut list for a show from the database as soon as we start it (like CDDB does when you pop a CD into your computer) and then autoskip commercials from there on out.

Anyway, that is my idea. I’d like to hear everyone’s thoughts. I think it could work, but let me know if you disagree.


06-19-2003, 06:25 AM
I'll start by throwing this out. What's to guarantee that each network starts the show at the same time? I think this idea will not work because of the time sync problem.

My idea is this: use a voting solution to guess where the breaks are. Use black frames, volume sampling, and Closed Captioning info to determine where the commercials are.
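A rough sketch of the voting idea, assuming Python and assuming each detector (black frames, volume, closed captions) has already produced its own list of candidate frame numbers. The detectors themselves are not shown, and the 15-frame window and 2-vote threshold are arbitrary.

```python
from typing import List, Sequence


def vote_on_cut_points(detectors: Sequence[Sequence[int]],
                       window: int = 15, needed: int = 2) -> List[int]:
    """Return candidate frames that at least `needed` detectors agree on,
    where "agree" means having a candidate within `window` frames."""
    agreed: List[int] = []
    for frame in sorted({f for d in detectors for f in d}):
        votes = sum(any(abs(frame - f) <= window for f in d) for d in detectors)
        # Skip candidates that are just a near-duplicate of the last accepted one.
        if votes >= needed and (not agreed or frame - agreed[-1] > window):
            agreed.append(frame)
    return agreed


black_frames = [1000, 5000, 9000]   # 5000 is a black frame mid-segment
quiet_audio = [1005, 9010]
caption_gaps = [1003, 7000]
print(vote_on_cut_points([black_frames, quiet_audio, caption_gaps]))  # [1000, 9000]
```

The lone black frame at 5000 gets only one vote and is dropped.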

Replay TV has had this technology for years, but it still has issues. Network TV doesn't want us to do this....

my 2 cents,

06-19-2003, 06:45 AM
Just as a side point, would it be more feasible to create an app where you pass in an image of what a commercial break looks like and it outputs a list of cut points?

I know tystudio and tytools have the capability to dump an image, so maybe it's possible to compare this against an mpeg stream. Haven't a clue how long this would take or even if it's possible. Sounds like the sort of thing someone must have done already with image recognition software.

Anyone know if this would be possible?

06-19-2003, 07:22 AM
Lloyd -

I think for prime time tv (i.e., the shows I Tivo), they are all network feeds (i.e., all should start at the same time). Now, there are definitely local commercials, but they are scheduled in. The affiliate knows that break 2 is all local and they put in their commercials. They also know that they get 2 minutes of black to insert their commercials and then the show is back on, so they have to go back "live" when the show is back on. They don't get tapes of the shows in the mail and start them around 8 or so. It's all off the feed. Directv should be even more so this way.

06-19-2003, 09:02 AM
One thing you need to take into account is that commercial breaks don't always coincide with I-frames at the black frame that occurs between the program and the commercial. If you've used any of the editing programs (e.g. TyStudio or TyTool) you will have noticed that you can't always get a cut to occur on the black frame because of the way the headers are arranged. If you try to automatically cut on the black frame, I'm guessing that you'll probably end up with GOP errors or serious sync issues.

Most of us with DTivos don't let them dial in but the clock timing comes down with the guide data anyway. Commercial breaks don't occur at set intervals but rather where the studios decide a cut point should occur in the story line. Every time I edit commercials in an episode of Seinfeld or Babylon 5 they always occur at different points in every single episode. As such it would be impossible to synch up a commercial cutter to automatically set the start and stop cut points at set times without hosing up the program itself. If it actually clipped only the commercial it would be a case of dumb luck.

06-19-2003, 09:08 AM

I know that the cuts would have to be on I-frames for using the current tytool, tystudio methods to go to dvd. I understand that if you were using the tivo to find your commercials, you might end up not being on an I-frame, but if you were using tytool or tystudio, your cuts would only be on I-frames, and after you made your cuts, you could upload the info so that I could then use your info to make my cuts. I don't think it would be too difficult to add in a little logic to find the closest I-frame to the downloaded cut point and make the cut there instead.
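The "closest I-frame" logic could look something like this. A sketch only, assuming Python and assuming the editing tool can hand over a sorted list of I-frame positions for the recording (I don't know whether tytool or tystudio expose that).

```python
from bisect import bisect_left
from typing import Sequence


def nearest_iframe(cut_frame: int, iframes: Sequence[int]) -> int:
    """Return the I-frame closest to `cut_frame`.

    `iframes` must be sorted ascending and non-empty.
    Ties go to the earlier I-frame.
    """
    i = bisect_left(iframes, cut_frame)
    if i == 0:
        return iframes[0]
    if i == len(iframes):
        return iframes[-1]
    before, after = iframes[i - 1], iframes[i]
    return before if cut_frame - before <= after - cut_frame else after


iframes = [0, 15, 30, 45]          # made-up positions, one I-frame every 15 frames
print(nearest_iframe(20, iframes))  # 15
print(nearest_iframe(23, iframes))  # 30
```

Whether to round toward the show or toward the commercial on a tie is a judgment call; rounding so that a few commercial frames survive is probably safer than clipping the program.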

Also, I am not suggesting that every Seinfeld has the commercials in the same place. The database would be episode specific - so "Seinfeld - The Contest" would have different data than "Seinfeld - Kenny Rogers Roasters." Syndicated shows would be tough, though, which is why I suggested we upload the original air date and the record date. Syndicated shows are not off of feeds, so the start times would not be frame specific. The commercials should still be in the right places even on syndicated shows, though. (I.e., once you flag the start of the first break using your own eyeballs, all the others should line up with the data that is downloaded.) With Tivo, many of us actually record, extract and archive the show when it is first run, and it is here that I think this method could be useful.
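For the syndicated case, the "flag the first break yourself" step amounts to shifting every downloaded cut by one offset. A sketch, assuming Python, and assuming the breaks really are the same length and spacing as in the database entry, which is a guess on my part.

```python
from typing import List, Tuple

Cut = Tuple[int, int]  # (start_frame, end_frame)


def shift_cuts(db_cuts: List[Cut], my_first_break_start: int) -> List[Cut]:
    """Line up downloaded cuts with a recording whose start time differs,
    using the one break start the user flagged by eye."""
    offset = my_first_break_start - db_cuts[0][0]
    return [(start + offset, end + offset) for start, end in db_cuts]


downloaded = [(18000, 23400), (40000, 45400)]   # made-up frame numbers
# I found the first break at frame 18090 in my recording (3 seconds later).
print(shift_cuts(downloaded, 18090))   # [(18090, 23490), (40090, 45490)]
```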

Finally, wrt clock data - the point is that you don't need to align your clock with tivo or from its server. Any ntp server on the internet will do because they are all synced. You wouldn't need to dial in - you would just run ntpdate and point it to any one of the myriad public ntp servers out there for clock updates.