#1

Okay, I've been playing with doing this programmatically. I'm not yet convinced that it's bug-free, but I've found some rather long movie title chains using the non-imdb list posted earlier:

10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF
THE<DEAD><BANG> BANG YOURE<DEAD><END> OF<DAYS> OF<HEAVEN> CAN<WAIT>
UNTIL<DARK><BLUE><CAR> 54 WHERE ARE<YOU> CAN COUNT ON<ME> MYSELF<I> AM
TRYING TO BREAK YOUR<HEART><CONDITION><RED><DAWN> OF THE<DEAD><HEAT>
AND<DUST> TO<GLORY><ROAD><HOUSE> OF<DRACULA> DEAD AND LOVING<IT> CAME
FROM BENEATH THE<SEA> OF<LOVE> AND<DEATH> BECOMES<HER> MAJESTY
MRS<BROWN><SUGAR> AND<SPICE><WORLD>
TRADE<CENTER><STAGE><FRIGHT><NIGHT> AND THE<CITY> OF<ANGELS> WITH
DIRTY<FACES> OF<DEATH><SHIP> OF<FOOLS> RUSH<IN>
COLD<BLOOD><BEACH><PARTY><GIRL> IN THE<CADILLAC><MAN> OF THE<HOUSE>
OF<FRANKENSTEIN> AND THE MONSTER FROM<HELL><NIGHT> FALLS ON<MANHATTAN>
MURDER<MYSTERY><ALASKA> SPIRIT OF THE<WILD><BILL> AND TEDS
BOGUS<JOURNEY> TO THE CENTER OF THE<EARTH> GIRLS ARE<EASY> COME
EASY<GO><NOW> YOU SEE HIM NOW YOU<DONT> BOTHER TO<KNOCK><OFF>
THE<BLACK> AND<WHITE> WATER<SUMMER><LOVERS> AND OTHER<STRANGERS> WHEN
WE<MEET> JOE<BLACK> HAWK<DOWN> TO<YOU> CANT TAKE IT WITH<YOU> LIGHT UP
MY<LIFE> AS A<HOUSE><PARTY><MONSTER><HOUSE> PARTY<3> NINJAS KICK<BACK>
TO<SCHOOL> OF<ROCK><STAR> TREK IV THE VOYAGE<HOME><ALONE> IN
THE<DARK><CITY> OF<JOY><RIDE> THE HIGH<COUNTRY><LIFE>
IS<BEAUTIFUL><GIRLS> GIRLS<GIRLS> WILL BE<GIRLS> JUST WANT TO
HAVE<FUN> AND FANCY<FREE> WILLY 2 THE ADVENTURE<HOME> ALONE<3> NINJAS
KNUCKLE<UP> CLOSE AND<PERSONAL><BEST> OF THE<BEST><MEN> CRY<BULLETS>
OVER<BROADWAY> DANNY<ROSE><RED><EYE> FOR AN<EYE> OF<GOD> TOLD ME<TO>
DIE<FOR> YOUR EYES<ONLY> THE STRONG SURVIVE A CELEBRATION
OF<SOUL><FOOD> OF<LOVE> WALKED<IN> AND<OUT><COLD><FEVER><PITCH><BLACK>
LIKE<ME> WITHOUT<YOU> ONLY LIVE<ONCE> IN THE<LIFE> OR SOMETHING
LIKE<IT> HAPPENED AT THE WORLDS<FAIR><GAME> OF<DEATH> WISH V THE FACE
OF<DEATH><WISH> UPON A<STAR> TREK THE MOTION<PICTURE><BRIDE>
OF<FRANKENSTEIN> MEETS THE WOLF<MAN> ON<FIRE> IN THE<SKY><HIGH>
SCHOOL<HIGH><SPIRITS> OF THE<DEAD> OF<NIGHT><MOTHER><NIGHT> OF THE
LIVING<DEAD> MAN<WALKING> AND<TALKING> ABOUT<SEX> AND THE
OTHER<MAN><TROUBLE> EVERY<DAY> OF THE<WOMAN> ON<TOP><GUN><CRAZY>
AS<HELL> UP IN<HARLEM> RIVER<DRIVE> ME<CRAZY><PEOPLE>
WILL<TALK><RADIO><DAYS> OF THUNDER

That's 175 titles strung together. The <> bracketed words are the ones which are the last of one title and the first of another. Lots of two-word titles show up in this chain.

By the way, a word about IMDB and data scraping. Quite a few years ago, as an exercise in learning Java, I decided to write a program which would look for "six degrees of Kevin Bacon" links between actors in the IMDB. This was a spare-moment project at work. After a few days, I got an e-mail from IMDB saying that they had detected my 'robot' and had disabled access to IMDB from my IP address, which to them was the proxy server for the company (Object Technology International). So OTI didn't have access to IMDB for a few days while they considered my contrite reply and promise to cease and desist.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/
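For readers who want to see how the bracket notation is built, here is a minimal sketch. The method name `render_chain` is hypothetical and is not taken from Rick's program; it assumes titles are already upper-cased, punctuation-free strings as in the list, and that each title starts with the previous title's last word.

```ruby
# Hypothetical helper (not from the original post): join an ordered list of
# titles into one string, writing each shared word once as <WORD>.
def render_chain(titles)
  tokens = [] # [word, shared?] pairs, in output order
  titles.each_with_index do |title, i|
    words = title.split
    if i.zero?
      tokens.concat(words.map { |w| [w, false] })
    else
      unless tokens.last[0] == words.first
        raise ArgumentError, "#{title.inspect} does not continue the chain"
      end
      tokens.last[1] = true # last word of the previous title is shared
      tokens.concat(words.drop(1).map { |w| [w, false] })
    end
  end
  # A shared word attaches directly to what precedes it; other words get a space.
  tokens.each_with_object(+'') do |(word, shared), out|
    if shared
      out << "<#{word}>"
    else
      out << ' ' unless out.empty?
      out << word
    end
  end
end

puts render_chain(['10 ITEMS OR LESS', 'LESS THAN ZERO', 'ZERO DAY', 'DAY FOR NIGHT'])
# prints: 10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR NIGHT
```

The last word of the final title stays unbracketed because nothing follows it, which matches how the chain above ends in "OF THUNDER".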

#2

On Jan 5, 2008 2:33 PM, Rick DeNatale <rick.denatale@gmail.com> wrote:
>
> 10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF
> <snip>
> WILL<TALK><RADIO><DAYS> OF THUNDER

Saw that one in the theaters. Save your money and wait for the torrent.

TwP

#3

Rick DeNatale wrote:
> Okay, I've been playing with doing this programmatically. I'm not yet
> convinced that it's bug-free, but I've found some rather long movie
> title chains using the non-imdb list posted earlier:
>
> 10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF
<snip>
> WILL<TALK><RADIO><DAYS> OF THUNDER

Why didn't it find Thunder Road[1]? (And then maybe Road Trip to Bountiful...)

[1] http://us.imdb.com/title/tt0052293/

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

#4

> > Okay, I've been playing with doing this programmatically. I'm not yet
> > convinced that it's bug-free, but I've found some rather long movie
> > title chains using the non-imdb list posted earlier:

code!

> Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
> Bountiful...)
>
> [1] http://us.imdb.com/title/tt0052293/

This is actually a very interesting question. I think the answer is probably "Travelling Salesman Problem."

--
Giles Bowkett

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com

#5

Giles Bowkett wrote:
>> Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
>> Bountiful...)
>>
>> [1] http://us.imdb.com/title/tt0052293/
>
> This is actually a very interesting question. I think the answer is
> probably "Travelling Salesman Problem."

This is not an optimization problem (not as stated so far, anyway).

There must be a bug somewhere--in IMDB, or in the script.

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

#6

> >> Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
> >> Bountiful...)
> >>
> >> [1] http://us.imdb.com/title/tt0052293/
> >
> > This is actually a very interesting question. I think the answer is
> > probably "Travelling Salesman Problem."
>
> This is not an optimization problem (not as stated so far, anyway).
>
> There must be a bug somewhere--in IMDB, or in the script.

OK. But why it didn't find combination X when it did find combination Y, surely that's a complex question to answer.

--
Giles Bowkett

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com

#7

[some sadly uncited person wrote]
>> > Okay, I've been playing with doing this programmatically. I'm not
>> > yet convinced that it's bug-free, but I've found some rather long
>> > movie title chains using the non-imdb list posted earlier:

I'm surprised no-one's found a loop yet. Is there one? What's the shortest?

Regards,

Jeremy Henty

#8

Giles Bowkett wrote:
>>>> Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
>>>> Bountiful...)
>>>>
>>>> [1] http://us.imdb.com/title/tt0052293/
>>> This is actually a very interesting question. I think the answer is
>>> probably "Travelling Salesman Problem."
>> This is not an optimization problem (not as stated so far, anyway).
>>
>> There must be a bug somewhere--in IMDB, or in the script.
>
> OK. But why it didn't find X combination when it did find Y
> combination, surely that's a complex question to answer.

Actually, what I overlooked is that Rick's original post said he used a "non-imdb list", and I found "Thunder Road" on imdb. Probably his list just doesn't include this one.

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
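Joel's guess is easy to check against a local copy of the list. The sketch below is only an illustration: the helper names are hypothetical, and it assumes a file named MOVIES.LST in the current directory holding one upper-case, punctuation-free title per line, which is how the titles appear elsewhere in this thread.

```ruby
# Hypothetical helpers; the file name and layout are assumptions.
def load_titles(path)
  File.readlines(path, chomp: true).map(&:strip).reject(&:empty?)
end

# Is this exact title in the list? (Titles are stored upper-case.)
def listed?(titles, title)
  titles.include?(title.upcase)
end

# Every title that could follow a title ending in the given word.
def continuations(titles, word)
  titles.select { |t| t.split.first == word.upcase }
end

# Only runs when a local copy of the list is present.
if File.exist?('MOVIES.LST')
  titles = load_titles('MOVIES.LST')
  puts "THUNDER ROAD listed? #{listed?(titles, 'Thunder Road')}"
  puts continuations(titles, 'thunder')
end
```

If `continuations` comes back empty for THUNDER, the chain has to stop at DAYS OF THUNDER no matter how the search is written.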

#9

On Jan 5, 2008 6:14 PM, Jeremy Henty <onepoint@starurchin.org> wrote:
> [some sadly uncited person wrote]
>
> >> > Okay, I've been playing with doing this programmatically. I'm not
> >> > yet convinced that it's bug-free, but I've found some rather long
> >> > movie title chains using the non-imdb list posted earlier:
>
> I'm surprised no-one's found a loop yet. Is there one? What's the
> shortest?

The shortest is easy:

    File.readlines('MOVIES.LST').find_all {|line| line.split.first == line.split.last}

    ["\n", "AREMEMBER\n", "AUTHOR AUTHOR\n", "BEST OF THE BEST\n",
    "BREAKER BREAKER\n", "BUDDY BUDDY\n", "CHUD II BUD THE CHUD\n",
    "CORRINA CORRINA\n", "DEATH WISH V THE FACE OF DEATH\n", "DIE MOMMIE
    DIE\n", "DIE MONSTER DIE\n", "DREAM A LITTLE DREAM\n", "EAST IS
    EAST\n", "EYE FOR AN EYE\n", "GIRLS GIRLS GIRLS\n", "GIRLS WILL BE
    GIRLS\n", "HIGH SCHOOL HIGH\n", "JULIA AND JULIA\n", "JUNGLE 2
    JUNGLE\n", "KRAMER VS KRAMER\n", "LADYBIRD LADYBIRD\n", "LIAR LIAR\n",
    "MELINDA AND MELINDA\n", "MOMENT BY MOMENT\n", "MURDER AND MURDER\n",
    "NIAGARA NIAGARA\n", "NIGHTBREED \n", "SCREAM BLACULA SCREAM\n",
    "SISTER MY SISTER\n", "SUNDAY BLOODY SUNDAY\n", "THEGUEST\n", "THEY
    SHOOT HORSES DONT THEY\n", "TIME AFTER TIME\n", "TORA TORA TORA\n",
    "WRONGSTAND\n", "YI YI\n", "YOU CANT TAKE IT WITH YOU\n"]

#10

On 2008-01-06, Chris Shea <chris@ruby.tie-rack.org> wrote:
> On Jan 5, 2008 6:14 PM, Jeremy Henty <onepoint@starurchin.org> wrote:
>>
>> I'm surprised no-one's found a loop yet. Is there one? What's the
>> shortest?
>
> The shortest is easy:
>
> File.readlines('MOVIES.LST').find_all {|line| line.split.first ==
> line.split.last}
> ["\n", "AREMEMBER\n", "AUTHOR AUTHOR\n", "BEST OF THE BEST\n",
> <snip>
> "WRONGSTAND\n", "YI YI\n", "YOU CANT TAKE IT WITH YOU\n"]

D'oh! Neat! Any twofers?

(Nitpick: the code should eliminate single-word titles.)

Jeremy Henty
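Both of Jeremy's points can be sketched in a few lines. This is an illustration rather than anything posted in the thread: `self_loops` and `two_title_loops` are hypothetical names, and the input is assumed to be an array of titles (blank lines and trailing newlines are cleaned up inside the methods). The first method applies the nitpick by skipping blank and single-word titles; the second looks for "twofers", pairs where each title's last word is the other's first word.

```ruby
# Hypothetical helpers. A title loops on itself when it has at least two
# words and its first and last words are equal.
def self_loops(titles)
  titles.map(&:strip).select do |title|
    words = title.split
    words.length > 1 && words.first == words.last
  end
end

# Pairs [a, b] where a's last word starts b and b's last word starts a.
# Each pair is reported once, in alphabetical order.
def two_title_loops(titles)
  clean = titles.map(&:strip).reject(&:empty?).uniq
  by_first = clean.group_by { |t| t.split.first }
  pairs = []
  clean.each do |a|
    wa = a.split
    next if wa.length < 2
    (by_first[wa.last] || []).each do |b|
      wb = b.split
      next if wb.length < 2 || a == b
      pairs << [a, b] if wb.last == wa.first && a < b
    end
  end
  pairs
end

sample = ["\n", "NIGHTBREED \n", "AUTHOR AUTHOR\n", "DAY FOR NIGHT\n",
          "NIGHT AND DAY\n", "LIAR LIAR\n"]
p self_loops(sample)      # prints ["AUTHOR AUTHOR", "LIAR LIAR"]
p two_title_loops(sample) # prints [["DAY FOR NIGHT", "NIGHT AND DAY"]]
```

Indexing the titles by first word with `group_by` keeps the pair search from comparing every title against every other title.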

#11

> >> There must be a bug somewhere--in IMDB, or in the script.
> >
> > OK. But why it didn't find X combination when it did find Y
> > combination, surely that's a complex question to answer.
>
> Actually, what I overlooked is that Rick's original post said he used a
> "non-imdb list", and I found "Thunder Road" on imdb. Probably his list
> just doesn't include this one.

Ah, OK. I was wondering. Yeah, Rick was using (I think) the official list from ITA. I posted it the other day.

--
Giles Bowkett

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com

#12

On Jan 5, 2008 5:05 PM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:
> Rick DeNatale wrote:
> > Okay, I've been playing with doing this programmatically. I'm not yet
> > convinced that it's bug-free, but I've found some rather long movie
> > title chains using the non-imdb list posted earlier:
> >
> > 10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF
> <snip>
> > WILL<TALK><RADIO><DAYS> OF THUNDER
>
> Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
> Bountiful...)

Well, first of all, because Thunder Road isn't on that list. Second, because that's the longest chain the program found in the time I gave it to run.

As I said, the program isn't of the quality that I'd prefer to share, but what the heck. It's a pretty dumb search, and I have it print a chain only the first time it finds one longer (in terms of the number of movies) than any it had found before. It's full of "the simplest thing that could possibly work" decisions with the idea of making it work before trying to make it fast.

    require 'net/http'

    class Title
      # Every title, indexed by its first word.
      @first_words = Hash.new { |h, k| h[k] = [] }

      def self.register(title)
        @first_words[title.first_word] << title if title.first_word
      end

      # Create (and register) one Title per line of the list.
      def self.process_lines(string)
        string.each_line do |line|
          new(line)
        end
      end

      # Start a search from every title, in first-word order.
      def self.generate_titles
        result = []
        @first_words.keys.sort.each do |fw|
          all_starting_with(fw).each do |title|
            # chains reports progress through print_chain; the list it
            # returns is always empty, so result stays empty too.
            title.chains.each do |chain|
              result << chain
            end
          end
        end
        result
      end

      def self.all_starting_with(word)
        @first_words[word]
      end

      def print_chain(chain)
        puts "#{chain.length}: #{chain.inject(chain.first.first_word) { |chained_title, title| title.chain_to(chained_title) }}"
      end

      # Append this title to string, bracketing the shared word.
      def chain_to(string)
        "#{string.sub(/\s(\S+)$/, '<\1>')} #{@words[1...@words.length].join(' ')}"
      end

      # True only the first time a chain beats the longest seen so far.
      def self.max_chain?(chain)
        @max ||= 0
        if chain.length > @max
          @max = chain.length
          true
        else
          false
        end
      end

      # Depth-first search: add self to the chain, then try every unused
      # title that starts with this title's last word.
      def chains(chain = [])
        root_chain = chain << self
        raise "Duplicate title in #{root_chain.inspect}" unless root_chain.length == root_chain.uniq.length
        print_chain(root_chain) if self.class.max_chain?(root_chain)
        result = []
        self.class.all_starting_with(last_word).each do |title|
          unless root_chain.include?(title)
            title.chains(root_chain.dup).each do |chain|
              result << chain
            end
          end
        end
        result
      end

      def initialize(title_string)
        @words = title_string.strip.split
        self.class.register(self)
      end

      def first_word
        @words.first
      end

      def last_word
        @words.last
      end

      def to_s
        @words.join(' ')
      end

      def inspect
        to_s
      end
    end

    # Net::HTTP.get returns the response body as a String.
    body = Net::HTTP.get("itafullsite.dev.neptuneweb.com", "/careers/puzzles/MOVIES.LST")
    Title.process_lines(body)
    p Title.generate_titles

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/
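To try the same idea without fetching the list, here is a compact, self-contained sketch of the search Rick describes: depth-first, never reusing a title, and remembering a chain only the first time it beats the longest seen so far. It is not Rick's program. `longest_chain` is a hypothetical name, the sample titles are taken from the chain quoted in this thread, and the approach only suits small inputs because the number of chains can grow exponentially with the size of the list.

```ruby
# Hypothetical, simplified version of the search for small in-memory lists.
def longest_chain(titles)
  clean = titles.map(&:strip).reject(&:empty?).uniq
  by_first = clean.group_by { |t| t.split.first }
  best = []
  extend_chain = lambda do |chain|
    # Strictly longer only, so the first chain of a given length is kept.
    best = chain.dup if chain.length > best.length
    (by_first[chain.last.split.last] || []).each do |nxt|
      next if chain.include?(nxt) # each title may be used once
      chain.push(nxt)
      extend_chain.call(chain)
      chain.pop # backtrack
    end
  end
  clean.each { |t| extend_chain.call([t]) }
  best
end

sample = ['ZERO DAY', 'LESS THAN ZERO', 'DAY FOR NIGHT',
          '10 ITEMS OR LESS', 'NIGHT AND DAY']
chain = longest_chain(sample)
puts chain.length # prints 5
puts chain
```

Pushing and popping on one shared array avoids copying the chain at every step, which is one of the places the original trades speed for simplicity with `root_chain.dup`.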