Jpag'alike script generation



stryder
06-28-2001, 10:39 PM
I am curious how many people are actually working on this, what state everyone's stuff is in and how much work is left :)

My version of the tvguide.pl script generates most of the addShows section but I don't have any of the addChannel section anymore. I haven't looked at the detail section yet. I am only handling the current date and not handling the case where the time asked for crosses into the next day.

TVguide.com seems to be testing a new layout for their page lately so I don't know if this script will continue to work without modification.

stryder
06-29-2001, 08:43 AM
I just noticed a problem with the tvguide perl script. This problem lies in its span time slots operation. I noticed in my output I am getting shows chopped into two pieces depending on where they lie on the time line. I don't know what kind of problems this would cause if this data were input into the tivo. If the show lies completely within a time block it will not be chopped up. It does get the duration correct, but only if you add the two pieces together.

If anyone has a quick fix for this problem it would be nice. :)

sir_lunatic
06-29-2001, 11:46 PM
Remember, we are modifying the script so that it will get 7-14 days' worth of data. So start your script so that it always pulls data starting at 12:00am, or 00:00 military time. It will pull in 2 hour increments. This is ok. If the title has "<<" at the beginning, drop it. Who cares about data from the day before we started the data collection?

Now, as far as details collection goes: once you have an array with the channels, titles etc., you can call the "show_detail()" sub on each title in the array. The only problem is that this sub also calls "load_data($url)", which stores its data in $tvg::INPUT, where the time grid is originally stored, and you still need that grid data to get the rest of the listings. We have to create $tvg::DETAILINPUT and parse that for show detail. You can get the show duration in the details, just round up to the nearest hour/half hour. Hmmmm or not.....Maybe not a good idea to round up.

stryder
06-30-2001, 12:17 AM
Show duration isn't a problem, I can already get that reliably. The problem is getting the show's starting time :). Without that the rest of the details are worthless. Unfortunately it doesn't seem like the html code contains the actual start times of the shows. The details do, however, but they may contain more than one show date.

The script actually retrieves data in 2 hour chunks. If you tell it 24 hours of time, it actually does 12 retrieves of the website data. The merge listings subroutine is supposed to merge all the shows that break these two hour chunks back together, however it seems to be broken.
The problem that happens with this is that, say, you have a movie that starts at 1pm. Since it's a 2 hour movie, it breaks the 2 hour barrier. You end up with the first hour of the movie as the first "show" and the second hour of the movie as the second "show". So you get two entries for the one show. I am working on a method to solve this though.

I just realized I said what you said again. Oops.


It blows my mind how many views this thread and the one it started in are getting. :)

Dinosaur
06-30-2001, 11:26 AM
Following the thread, going to be working on it.
Mere rabbit so far, but once I come up to speed with this sucker I promise to help!
20+ years of Database experience, if we can't crack MFS I'll be surprised :).
Bear (bare??, whatever (sp!)) with me! Stick with it, we're doing good so far, Rome wasn't built in a day!!

Dino
(goddam typo's, never post after > 2 bottles of wine!!)

sir_lunatic
07-01-2001, 12:45 AM
I may have found a start to getting the start time.

towards the end of "sub parse_shows" you will see a line:

#print "[$_]<$tabs>\n";

uncomment this line; remove the "#", save and run the script

It will print out each channel line, followed by each show on the line and how many tab spaces it occupies in the final output.

Per this script, it will only show 2 hours per screen, and per the comments in this script, he uses 24 tabs per line equating to 5 minutes per tab.

So if the start time for this first page is 00:00 and the first channel has two shows, the first occupying 6 tabs and the second 18 tabs, 6 times 5 equals 30 minutes, the second show starts at 00:30.

Do what I said above and look at the output, you will understand when you see it.
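
The tab arithmetic described above can be sketched as follows. This is only an illustration, written in Python rather than the script's Perl; the 5-minutes-per-tab figure comes from the post, while the function names and the input shape are hypothetical.

```python
# Sketch only: 24 tabs per 2-hour page, so one tab is 5 minutes (per the post).
# start_times() and hhmm() are hypothetical helper names, not part of tvguide.pl.
MINUTES_PER_TAB = 5


def start_times(page_start_min, tab_widths):
    """Return (start_minute, length_minutes) for each show on one channel line."""
    result = []
    offset = page_start_min
    for tabs in tab_widths:
        length = tabs * MINUTES_PER_TAB
        result.append((offset, length))
        offset += length  # the next show starts where this one ends
    return result


def hhmm(minutes):
    """Format minutes since midnight as HH:MM."""
    return "%02d:%02d" % divmod(minutes, 60)


# Page starts at 00:00; the first show spans 6 tabs, the second 18 tabs.
shows = start_times(0, [6, 18])
print([(hhmm(start), length) for start, length in shows])
```

With the numbers from the post, the second show comes out starting at 00:30 and running 90 minutes.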

iceberg
07-01-2001, 10:12 AM
Just to give you guys a heads up. Supposedly works with the new version of the tvguide site:

http://www.cherrynebula.net/projects/tvguide/tvguide.php

P.S. I should be getting a refurb 312 this week. I will start working on this project as soon as I get the hardware in. Enjoy.

stryder
07-01-2001, 02:32 PM
Cool, thanks for the tip. I will give it a shot and see what it does differently. On another note, I seem to have screwed up my tivo partition, so I am now attempting a restore from backup.

sir_lunatic
07-02-2001, 04:21 PM
ok to get start time:

I started with an unmodified tvguide script.

replace "sub print_layout" with the following:

--------cut here --------
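sub print_layout
{
    # Nothing to print if no listings were parsed.
    return if ($#tvg::listings < 0);

    my ($x, $st, $ch);

    print "\n";

    foreach $x (@tvg::listings) {
        # Each entry is "label#slots"; one slot is one 5-minute tab.
        my ($l,$slots) = split(/#/,$x);

        # A zero-slot entry is a channel line: remember it and reset the offset.
        if ($slots == 0) {
            $st = 0;
            $ch = $l;
            next;
        }
        # channel, title, start offset (min), length (min)
        print "$ch,$l," , $st * 5 ,",", $slots * 5 , "\n";
        $st = $st + $slots;
    }
}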
--------cut here----------

I ran it the following way:

./tvguide -span 24 -date 07/03 -time 00:00 > listing

This yields: channel, title, start offset in minutes, length in minutes.

The start offset is counted in minutes from the time the guide starts.

I now see that the titles of the shows are going to have to be retrieved from the details, because titles get truncated when the cells on TV Guide's web site are too small.

sir_lunatic
07-02-2001, 04:47 PM
Now as far as the problem of the show splits, we would need to get the real show title from detail, as well as length from detail.

Check to see if a show appears twice in adjacent time slots and compare their lengths to the one retrieved in detail. If they are smaller, they need to be combined as one entry.

loop,loop,loop..........

shouldn't be too bad.
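
A minimal sketch of that loop, in Python for illustration rather than the script's Perl. It assumes rows shaped like the print_layout output (channel, title, start in minutes, length in minutes), sorted by channel and start, and a hypothetical detail_length lookup holding the full length taken from the detail page.

```python
# Sketch only: merge_split_shows() and its arguments are hypothetical names.
def merge_split_shows(rows, detail_length=None):
    """Rejoin shows that were cut in two at a 2-hour chunk boundary.

    rows: list of (channel, title, start_min, length_min), sorted by
    channel and then by start. detail_length maps title -> full length in
    minutes from the detail page, when known.
    """
    detail_length = detail_length or {}
    merged = []
    for channel, title, start, length in rows:
        if merged:
            p_ch, p_title, p_start, p_len = merged[-1]
            adjacent = (p_ch == channel and p_title == title
                        and p_start + p_len == start)
            full = detail_length.get(title)
            # Only combine if the pieces together do not exceed the detail length.
            fits = full is None or p_len + length <= full
            if adjacent and fits:
                merged[-1] = (p_ch, p_title, p_start, p_len + length)
                continue
        merged.append((channel, title, start, length))
    return merged


rows = [
    ("5", "Movie", 780, 60),  # 13:00-14:00, cut at the chunk boundary
    ("5", "Movie", 840, 60),  # 14:00-15:00, the second piece
    ("5", "News", 900, 30),
]
print(merge_split_shows(rows, {"Movie": 120}))
```

The detail length is what keeps two back-to-back half-hour episodes of the same show from being glued together by mistake.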

stryder
07-02-2001, 05:00 PM
I am going to add your changes to my version of the file. I modified the print_channel_line function, which allowed me to print out the show numbers. I also used a perl module called Date::Calc.

sir_lunatic
07-02-2001, 05:06 PM
Actually, if we stored the show number instead of the title in tvg::listings, it would make life simpler. Then it would be easier to get details of each show at the end.

stryder
07-02-2001, 05:13 PM
I didn't need to store it because the script already loads them in sequential order into @tvg::listings. However, that might be a very good idea. I like your method of computing the start time of the show. Keep in mind the scripts Jpag produces use times in terms of seconds.

stryder
07-02-2001, 05:31 PM
Looks like TVGuide has screwed with the page layout again.

Hey by the way it appears that the beta 1.8.6 version of the script might have fixed the short titles. Maybe.

sir_lunatic
07-02-2001, 09:24 PM
His 1.8.6b code works fine except for one bug I can find so far.

It appears to drop the first channel number in the list. To fix, do the following:

Near the bottom of "sub parse_listings"

Change:

s/>//g;
s/\s+$//;

next if length($_) < 2;


To read:


s/>//g;
s/\s+$//;
s/\:clist//g; # Add this line.

next if length($_) < 2;


This fixes the bug.

Hopefully they will stick with this code this time.

stryder
07-02-2001, 09:28 PM
I think there is another bug. Well, at least on my computer, if I do a span of 12 hours I get the first two shows repeated for the whole rest of the time slots. Do you?

sir_lunatic
07-02-2001, 09:48 PM
Yup, same problem..........

Back to the drawing board........

sir_lunatic
07-02-2001, 09:52 PM
His date and time isn't working, it continuously gets the current date and time.

stryder
07-02-2001, 09:56 PM
Well that is kinda a good thing, because we want to compute the time difference for the start time from 00:00 time.

Do I need to worry about the situation where the guide information crosses the midnight barrier, where the day changes? The reason I ask is that right now my code assumes you are looking at the current day's date to compute the day offset.

Vadim
07-02-2001, 10:11 PM
stryder,

Please check your PM.

sir_lunatic
07-02-2001, 10:30 PM
Confirmed, the date, time function no worky. Their site continuously only shows the current date and time for the script.

They changed how to request a specific date and time.

Will look into it further.

stryder
07-02-2001, 10:35 PM
Do you think they are doing this to combat us specifically?

sir_lunatic
07-02-2001, 10:53 PM
No, I don't think so.

I saw today that TV Guide and MS have teamed up and they are redoing their servers to run on the MS platform. If you look at the URLs, they are now .asp's.

So we will figure it out.

So far I have found that they have renamed the post variable names. Date and time are no longer &ST; they are now &event_date and &event_hour.

But I still can't get them to work in a manual URL in a browser.

stryder
07-02-2001, 11:01 PM
I hope that doesn't mean that the servers won't work very well :)

I was hoping we would have a reliable source of guide information.

TheDoctor
07-03-2001, 02:44 AM
I got my first Tivo a little over a week ago, and have been following this thread since. I thought I would use a different approach and try to write the whole thing in Visual Basic over the weekend. I used sites other than TV Guide as a source, and began working with manually saved html files (4 files per day covering 6 hours each). Handling cookies directly from sockets is a task I will have to complete later; it was giving me a headache.

I got it to build the script with all of the data except the genre types, including irregular start and end times. I decided to merge in a second source of data to build a composite between the two web pages and get better info (actor, episode title, director etc…), although I will have to work to figure out how to load it…

I have had to move the data from memory storage to an access db, and have yet to attempt a script build from the new source.

Currently I have it to the point that it will check for available files and store all parsed elements into the database. It appears to be handling date rollovers correctly in the database. But I will have to rebuild the script generator to work with the new db and load it to verify. I am currently building data for 10 and 11 July.

Even once I get it to pull the data directly from the web pages, I have doubts that it will be ‘releasable/maintainable’. In addition to general Windows dll/db issues, it is simply too easy for the provider to embed control characters or extra html to break things. I had to handle several irregular entries in the early testing. Once any script is distributed, it will be targeted by Tivo for obvious reasons, and by the data providers, because no one is clicking on their banner ads….

One alternative for some might be to write a Tivo-resident program that starts recording when a video signal is supplied, as some VCRs do. This would allow those with receivers/cable boxes that support this feature to use them as the ‘guide’, and the ‘now playing’ titles could be changed later from a default date/time filename. Just a thought.

sir_lunatic
07-03-2001, 08:48 AM
Hey, the more options the better. We chose to do it in perl since it could then be run on any platform, whether it be MS, *nix or mac.

By the way, it looks like tvguide's server farm is running different versions of their guide on different servers.

aram1s
07-05-2001, 09:02 PM
Ya know, I wish I had discovered this site before today...

I have a mostly (90%) working perl script that scans the tvguide site, grabs all of the channel numbers/names, then grabs all of the show ids for the timespan I am looking at, then grabs a detail on each show, which also gives all of its various time slots.

Then it outputs the channel list in jpag format, followed by the show times followed by the descriptions. It has already been modified to work with the new asp pages.

But I have been unable to successfully test it, I think because the unit I have been messing with is deactivated. Do I have to flip the ServiceState in /Setup to something besides 3? Something else perhaps?

Any help would be appreciated.

A

Pique
07-06-2001, 02:54 PM
aram1s:
I don't think ServiceState will help with that; it's good for avoiding nag screens. Lifetime=5.

Don't know if it is the same, but I've been having trouble with the addProgram section. It seems that if you don't have the service, the /StationDay section doesn't get updated to place the date (StationDay type) entry under every channel FSID entry, which is what links the program to the show schedule. JPag's runupdate.tcl script doesn't create that type of entry, though it does create Program type entries in /MyGuide/Programs.

did you see this too?

Sony SVR-2000 1.3

P.

cwingert
07-07-2001, 04:01 PM
Is this script available for download?

Thanks

trainboy
07-22-2001, 11:26 AM
If you are having trouble with the start/end times for programs that span half-hour intervals, one solution is to reconstruct the entire day's worth of data on your local machine.

To do this, employ a slot map with one slot for each five-minute interval throughout the day and one row for each channel. As you retrieve the data, fill in the slots using the database key for the program as the slot value. The column width number in the guide data gives you a direct value for the number of slots to fill in (i.e. beginning with the two hour period that you requested guide data for, a column width of 5 means fill in five slots [25 minutes] from that time forward, etc.). I did a calculation based on 125 channels and four byte slots and it comes to under a megabyte per day for slot data.
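
A rough sketch of such a slot map, in Python for illustration only. It assumes each retrieved chunk yields, per channel, a list of (program key, column width) cells in guide order; the function names and that input shape are hypothetical.

```python
# Sketch only: one slot per 5 minutes, one row per channel, as described above.
SLOT_MINUTES = 5
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 288 slots per channel per day


def new_slot_map(channels):
    """One row of empty slots per channel."""
    return {ch: [None] * SLOTS_PER_DAY for ch in channels}


def fill(slot_map, channel, chunk_start_min, cells):
    """Fill slots from chunk_start_min forward; width is the number of slots."""
    slot = chunk_start_min // SLOT_MINUTES
    for key, width in cells:
        for i in range(slot, min(slot + width, SLOTS_PER_DAY)):
            slot_map[channel][i] = key
        slot += width


def runs(slots):
    """Collapse a row of slots into (program_key, start_min, length_min) runs."""
    out = []
    for i, key in enumerate(slots):
        if key is None:
            continue
        if out and out[-1][0] == key and out[-1][1] + out[-1][2] == i * SLOT_MINUTES:
            out[-1] = (key, out[-1][1], out[-1][2] + SLOT_MINUTES)
        else:
            out.append((key, i * SLOT_MINUTES, SLOT_MINUTES))
    return out


slot_map = new_slot_map(["5"])
fill(slot_map, "5", 12 * 60, [(101, 12), (202, 12)])  # chunk starting 12:00
fill(slot_map, "5", 14 * 60, [(202, 12), (303, 12)])  # chunk starting 14:00
print(runs(slot_map["5"]))
```

Because program 202 fills contiguous slots on both sides of the 14:00 boundary, it reads back as a single two-hour run, which is the point of the slot map.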

Meanwhile, whenever you see a title for a database key, add it to a lookup table (you could use a hash but I'd put this into a DBM database). As the titles trickle in, you can check to see if you've got better data (e.g. for a slot that starts at 1:55, the title will be "B >>". Later on, when the next two-hour chunk of data is gotten, the real title will be "Bedtime for Bonzo"). I'd suggest that a good algorithm for determining "better" is to check for longer. Any title that is longer is better. After you've gotten the whole guide, you should have good titles for practically everything (except for "Jack Stargazer" who always shows up as "J..").
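
The "longer is better" rule can be sketched like this, in Python for illustration. A plain dict stands in for the DBM database mentioned above, and offer_title is a hypothetical name.

```python
# Sketch only: a dict stands in for the DBM title database.
def offer_title(titles, key, candidate):
    """Store candidate for key only if it is longer than the current title."""
    current = titles.get(key, "")
    if len(candidate) > len(current):
        titles[key] = candidate
    return titles.get(key, "")


titles = {}
offer_title(titles, 4711, "B >>")               # truncated cell at 1:55
offer_title(titles, 4711, "Bedtime for Bonzo")  # full title from the next chunk
offer_title(titles, 4711, "Bedti..")            # a later, shorter sighting is ignored
print(titles[4711])
```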

(Continued next post -- 2K bytes limit is a real pain in the ass)

trainboy
07-22-2001, 11:29 AM
(continued from previous post)

One concern that I have is getting noticed by TV Guide. If you hit their server too often, they may discover you and block you. To avoid this problem, I was thinking of running the program to get guide data every two hours and pulling a single two-hour chunk from them. I'd set it up to pull guide data for two weeks from now. Since I have a network-attached Linux server, this is a simple thing to do via cron. The slot map and title data can be saved in a DBM database which is updated every two hours. Over the space of a 24-hour period, you'd pull a whole day's worth of guide data without even raising a blip on the radar (especially if you get a new IP address every time you dial up like I do).

OK, back to the title data. I was thinking of putting additional information in the title database such as show description, actors, etc. Only problem is that calling up the show detail for every show will cause a whole heap o' hits and get me noticed again. To avoid this, I was going to read a config file that listed only the shows I cared about. For example, all movies, every episode of "Buffy the Vampire Slayer" and "Days of Our Lives". For all of the other shows (e.g. Jerry Springer), all I need is the title because I couldn't care less about them. One could just pull the show details when the guide data is gotten but, if one is paranoid, one could pull them at random intervals throughout the two hour period (determine how many show details are needed and then divide this number up into say 1:45 randomly and proceed). Incidentally, if you care, now is the time to fix up "Jack Stargazer". If a title is shorter than a certain threshold (e.g. 5 characters), always pull its show detail and use the title from there.
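
One way the selection and the random spacing could look, sketched in Python for illustration. The 5-character threshold and the 1:45 window come from the post; the function names, the wanted set and the use of a seeded random generator are assumptions.

```python
import random

# Sketch only: needs_detail() and schedule_fetches() are hypothetical names.
WINDOW_SECONDS = 105 * 60  # spread the detail requests over 1:45


def needs_detail(title, wanted, min_len=5):
    """Fetch details for shows on the wanted list, and for suspiciously short titles."""
    return len(title) < min_len or title in wanted


def schedule_fetches(show_keys, window_seconds=WINDOW_SECONDS, rng=None):
    """Pair each show key with a random offset (seconds) inside the window."""
    rng = rng or random.Random()
    offsets = sorted(rng.uniform(0, window_seconds) for _ in show_keys)
    return list(zip(offsets, show_keys))


wanted = {"Buffy the Vampire Slayer", "Days of Our Lives"}
seen = {1: "Buffy the Vampire Slayer", 2: "Jerry Springer", 3: "J.."}
to_fetch = [key for key, title in seen.items() if needs_detail(title, wanted)]
plan = schedule_fetches(to_fetch, rng=random.Random(1))
for offset, key in plan:
    print("fetch detail for show %d after %.0f seconds" % (key, offset))
```

Here "Jerry Springer" is skipped, while the truncated "J.." is fetched so its real title can be recovered.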

(continued next post -- 2K bytes limit is still a real pain in the ass)

trainboy
07-22-2001, 11:30 AM
(continued from previous post)

Alright! At the end of the day, I have a complete slot map plus all of the title and detail data that I need. Using this data, it is a simple matter to generate TiVo program data in whatever format the database gurus decide will work (the best would be to set up an httpd on your Linux box and have the TiVo pull it from there as if it were the mother ship -- for the TiVo it would be business as usual and nobody the wiser). Problems with spanned time slots, truncated title data, etc. are nonexistent. Hopefully, nobody at the content provider notices us and shuts us off.

There remain a few housekeeping chores to take care of. Once a day, I'd delete the slot map for yesterday (keeping it around until it is history is not a bad idea in case one needs to reconstruct data already sent to the TiVo for some reason). Also, the title database needs to get cleaned up. Either one figures out how to delete the titles when they are no longer referred to by the slot map or they get timestamped so that they can be reused. The first approach is probably the easiest to implement, it merely requiring a daily scan of all existing slot maps to determine what is referred to and garbage collecting the rest.
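
The first approach, scanning the slot maps and dropping unreferenced titles, might look like the sketch below (Python, for illustration only). It assumes each day's slot map is a dict of channel to a list of program keys, with None for empty slots; collect_garbage is a hypothetical name.

```python
# Sketch only: slot maps are assumed to be {channel: [program_key or None, ...]}.
def collect_garbage(titles, slot_maps):
    """Delete titles that no remaining slot map refers to; return the deleted keys."""
    referenced = set()
    for slot_map in slot_maps:  # one slot map per day still kept
        for row in slot_map.values():
            referenced.update(key for key in row if key is not None)
    dead = [key for key in titles if key not in referenced]
    for key in dead:
        del titles[key]
    return dead


titles = {101: "News", 202: "Movie", 303: "Old Show"}
today = {"5": [101, 101, 202, None]}
tomorrow = {"5": [202, 202, None, None]}
removed = collect_garbage(titles, [today, tomorrow])
print(removed, sorted(titles))
```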

I've thought about doing most of this but then realized that other people had beaten me to it. So, I'm just going to wait and see what develops. However, I couldn't resist throwing in my two cents' worth on the off chance that it might help.

trainboy
07-22-2001, 12:15 PM
Incidentally, I already got a fair piece of this code working before I noticed what was going on in this forum. I have stopped work on it because I have about 15 other things that I'm working on. However, if anyone thinks I should finish it, I'll do so. I actually do this stuff for a living (yes, that's right, pulling HTML crap off of Web sites and turning it into useful data) so I'm pretty familiar with all of the ways that the content provider can break working scripts. I just don't want to duplicate everyone else's work.

I see from another thread that people are working on deciphering slice files. That would be my choice. Just generate the slice files from the data pulled from the content provider and let the TiVo get them from httpd. Most of the work gets done on a Linux server and the TiVo just thinks it's doing its usual thing.

Maybe we should divide up the work. Who wants to take care of getting the guide data? Who wants to decipher the slice files? Who wants to write the Web pages for Apache? Solve the problem only once?

TiVo_Canada
08-07-2001, 01:02 PM
Jpag's "runupdate.tcl" script is generating errors for me under v1.3. Here is the output. Can anyone offer any assistance? It looks like /StationDay/ is not being handled properly.


quote:
--------------------------------------------------------------------------------

08/06:14:10:38: ./runupdate: adding channels
08/06:14:10:38: ./runupdate: addChannel: TVO
08/06:14:10:38: ./runupdate: processing channel list
08/06:14:10:38: ./runupdate: adding programs
08/06:14:10:39: ./runupdate: addProgram TVO News TVO News with Joe Blow.
08/06:14:10:39: ./runupdate: Searching for TVO-NEWS in /SeriesTitleGS
08/06:14:10:39: ./runupdate: Searching for TVO-NEWS in /MyGuide/Programs
08/06:14:10:39: ./runupdate: TVO News not found - creating /MyGuide/Programs/TVO-NEWS
08/06:14:10:39: ./runupdate: created 28806
08/06:14:10:39: ./runupdate: updated showing TVO 0 with TVO News 11540 52200 5100
08/06:14:10:39: ./runupdate: /StationDay/0/11540
08/06:14:10:39: ./runupdate: Cannot open station/day /StationDay/0/11540 can't open object (errDbNotFound)
- skipping

--------------------------------------------------------------------------------

Here's showlist:


quote:
--------------------------------------------------------------------------------

proc addChannels { } {
addChannel TVO
}
proc addShows { } {
addShow 0 TVO 11540 52200 5100 {TVO News}
}
proc addPrograms { } {
addProgram {TVO News} "1 371 133 105" "2001" "" {TVO News with Joe Blow.}
}

--------------------------------------------------------------------------------

TheDoctor
08-08-2001, 07:46 AM
The last time I checked, the Jpag script was updating existing StationDay objects, not creating them. From tivosh, issue 'dumpobj /StationDay/0/11540'. If you see (errDbNotFound), that would most likely be the problem. You will need to create the StationDay object with the Showing members.
(I am sending you an e-mail with the StationDay structure from 2.0, 1.3 does not appear to allow wildcard or -depth options when accessing the StationDay object.)

TiVo_Canada
08-08-2001, 12:28 PM
I'm confused why these StationDay objects would not already be created. How is everyone else able to run this script with new channel info etc..

I'll wait for the email.

Thanks.

TheDoctor
08-08-2001, 01:20 PM
When the slice file is loaded, a station day is created for, say, channel 100 that points to a Program record with a title of 'Pay-Per-View Preview'. By default the StationDay object shows each of these items running for 4 hours. The Jpag scripts create Program records in a new MFS directory, /MyGuide/Programs, then point the existing station day object to them.
From the log you provided, you are attempting to load a channel with a record number of 0. If you can do a dumpobj 0 and get a valid station ID, great. (As far as I know there is no record number 0.) Every record added to the tivo gets an object number that is specific to that tivo. A station ID may be object 2081 on one machine and 10328 on another. The slice files use a server ID that is consistent across machines to reference each object, and that is translated by your tivo into a local ID during load.
By creating a separate object in a new directory, the Jpag script bypasses the need to look up each master record. As it adds new records it adds the ID to a short index key that it keeps in memory. It is designed to load a single day of data to a handful of channels. There are at most a few hundred entries.

The StationDay object gets a special pointer when it is created. In 1.3 it is /StationDay/(station object number)/(day code); in 2.0 it is /Schedule/(station object id):(day code):(start time of record collection):(end time of collection):(and other stuff). These pointers point to the real objects.

TheDoctor
08-08-2001, 01:23 PM
I am not as unhappy as my last message indicates. It is translating colons as frowns.

TiVo_Canada
08-08-2001, 05:01 PM
Okay - if I understand you correctly the objectID that I returned for my channel might be the problem here.

I thought I was going to create a file in /StationDay/slice/program rather than an entry in the dB

What I need to do is get the objectID for the channel in question.

I'll use dumpobj to get my channel info and work it back from there.

Is this correct?

TheDoctor
08-09-2001, 08:21 AM
This is from one of the old Jpag scripts:

proc processChannelList { } {
    global chList hdb

    ForeachMfsFile id tempname type "/StationDay" "" {

        RetryTransaction {
            set sobj [db $hdb openid $tempname]
            set ssign [dbobj $sobj get CallSign]
        }
        set index [lsearch $chList [list $ssign 0]]
        if {$index >= 0} {
            set chList [lreplace $chList $index $index [list $ssign $tempname $id]]
        }
    }
}

It appears that this portion should look up an object ID based on the callsign. …I am no expert on the Jpag script… Any statements I make about how the Jpag script works are based on a glance at the code and trying to compare it against what I have observed about the object structure. I have never used the script. From your log I would guess that it is not finding the TVO callsign and is assigning a 0 for the object ID. I would say that using dumpobj to find the correct ID would be a good place to start. It appears that you are creating a program object correctly. I believe for the script to run you will need to have the callsign of a valid station already in the tivo, and a station day already in the tivo.

I …BELIEVE… that the script first creates a list of callsigns in memory (chList), then looks up the object IDs for those callsigns. It then attempts to find program data, creating new records if needed. The object ID for the program is then added to memory as well (prList). Using the ID’s in memory (chList and prList) it then attempts to modify existing showing objects under the stationday object to point to the alternate program data and modifies the running time properties. If there are not enough showing objects, then it creates them.

TheDoctor
08-09-2001, 08:21 AM
That is what I …THINK… it is designed to do. I don’t know if the old copy of the script I have is the same one you are trying to use, or what your input data looks like, or the state of the object structure on your tivo. Trust No One. Compare what I am saying to what you are seeing and act accordingly. My advice is worth the paper that it is printed on, unless you have actually printed this out, in which case the paper is probably worth slightly less.

The script creates new database objects outside of the primary database structure but must then link them back in, which is done with the …mfs link… commands.

Play with it until it works or you break it beyond repair.
---
The Doctor
As a systems analyst it is my job to beat my head against the wall until I knock myself out or I create an opening to move forward.

TiVo_Canada
08-10-2001, 04:16 PM
I'm starting to get a clearer picture of this now. Since my original Guide data expired long ago I don't have anything in /StationDay.

The only way to get it there is to fiddle with mfs and add one.

I notice that I still have expired data in /Showcase and /SeasonPass. Wonder if I can copy one of those or move it - albeit the objects are different formats.

Woe is me

DoctorW
08-16-2001, 11:11 PM
I'm working on modifying JPAG's scripts to get
the Australian TV guide (on the web) into a
format which I can load into my PAL-modified TiVo.

What does the parameter order mean in the
addShow procedure in runupdate.tcl? I've looked
at JPAG's example inputs, where it has values
0 to 12. Does it allow you to have multiple shows
with the same name?

Thanks in advance!

TiVo_Canada
08-17-2001, 09:54 PM
Hope this helps:

example:

addShow 0 CTV 11526 52200 5100 {CTV News}

0 = time slot, can be 0-x for same day show, different times

11526 = days since Jan. 1, 1970 (don't forget leap years)

52200 = # seconds since 12:00am

5100 = duration of program in seconds

{CTV News} = description
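
To make the arithmetic concrete, here is a small sketch (Python, for illustration only) that builds an addShow line from a start time and a duration. The field order follows the example above; add_show_line is a hypothetical helper, and treating the start time as plain local time with no time zone adjustment is an assumption.

```python
from datetime import date, datetime

# Sketch only: add_show_line() is a hypothetical helper, not part of runupdate.tcl.
EPOCH = date(1970, 1, 1)


def add_show_line(slot, callsign, start, duration_seconds, title):
    """Format one addShow entry from a naive local start time."""
    day = (start.date() - EPOCH).days  # leap years are handled by datetime
    seconds = start.hour * 3600 + start.minute * 60 + start.second
    return "addShow %d %s %d %d %d {%s}" % (
        slot, callsign, day, seconds, duration_seconds, title)


# 2:30 PM on August 6, 2001, running 85 minutes.
line = add_show_line(0, "TVO", datetime(2001, 8, 6, 14, 30), 85 * 60, "TVO News")
print(line)
```

This reproduces the addShow line from the showlist posted earlier in the thread (day 11540, 52200 seconds, 5100 seconds).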

DoctorW
08-19-2001, 07:06 PM
Thanks for that Stryder. I take it the JPAGs stuff doesn't set star ratings, or advisory warnings.

Now I need to find out how to set channels up manually via scripts...

Filch
08-27-2001, 12:14 PM
I'm trying to figure out where to start/what updates to have.
I just joined in the middle of this thread...
I have the original JPAG scripts... what else do I need?
My firmware is 1.3...
Maybe someone could post a diff of their current project/a link to the differences?