Quote (rockonkenshin @ Jun 18 2014 01:41pm)
Did you write that or did you find that online? Because that code is really terrible. You shouldn't ever parse HTML with regex.
Does using regex on their feeds API count?
This is a plugin for my IRC bot, which uses Cinch as the framework.
Code
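require 'cinch'
require 'open-uri'

class YoutubeChannelParser
  include Cinch::Plugin

  listen_to :channel

  # Reply with video details whenever a channel message contains a YouTube link.
  def listen(m)
    return unless m.message =~ /youtube\.com\/watch\?|youtu\.be/
    # Pick the first YouTube URL, not just the first URL in the message.
    url = URI.extract(m.message, ['http', 'https']).find { |u| u =~ /youtube\.com\/watch\?|youtu\.be/ }
    return unless url
    title, author, duration, date, views = parseUrl(url)
    return unless title
    # Pass every value through %s so a '%' in the title cannot break the format string.
    m.reply Format("%s%s%s %s: %s, %s: %s, %s: %s, %s: %s, %s: %s" % [
      Format(:bold, "["), Format(:red, "YouTube"), Format(:bold, "]"),
      Format(:bold, "Title"), title,
      Format(:bold, "Author"), author,
      Format(:bold, "Duration"), Time.at(duration.to_i).gmtime.strftime('%H:%M:%S'),
      Format(:bold, "Date Added"), date,
      Format(:bold, "Views"), views
    ])
  end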
  # Look up a video through the gdata feeds API.
  # Returns [title, author, duration, date, views], or nil if anything is missing.
  def parseUrl(url)
    return unless url
    # The ID is the v= query parameter (in any position) or the youtu.be path segment.
    youtubeID = url[/[?&]v=([\w-]+)/, 1] || url[/youtu\.be\/([\w-]+)/, 1]
    return unless youtubeID
    data = URI.parse("http://gdata.youtube.com/feeds/api/videos/#{youtubeID}").read
    # String#[] with a regex returns nil on a failed match instead of raising.
    title    = data[/<title type='text'>(.+?)<\/title>/, 1]
    author   = data[/<author><name>(.+?)<\/name>/, 1]
    duration = data[/<yt:duration seconds='(\d+)'\/>/, 1]
    date     = data[/<published>(.+?)<\/published>/, 1]
    views    = data[/viewCount='(\d+)'\/>/, 1]
    return unless title && author && duration && date && views
    return title, author, duration, date.split('T').first, views
  end
end
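For comparison, here is a rough sketch of what the same extraction might look like with an XML parser instead of regexes. It uses REXML, and everything about the sample entry is an assumption: the element names are copied from the regexes above, but the values, the namespace URIs, and the flat layout are made up for illustration, and `parse_entry` is a hypothetical helper, not part of the plugin.

```ruby
require 'rexml/document'

# Hypothetical sample entry: element names come from the regexes above,
# values and namespace URIs are assumptions, and the layout is simplified.
SAMPLE_ENTRY = <<~XML
  <entry xmlns='http://www.w3.org/2005/Atom'
         xmlns:yt='http://gdata.youtube.com/schemas/2007'>
    <published>2014-06-18T13:41:00.000Z</published>
    <title type='text'>Example &amp; video</title>
    <author><name>someuser</name></author>
    <yt:duration seconds='225'/>
    <yt:statistics favoriteCount='0' viewCount='1234'/>
  </entry>
XML

# Hypothetical helper: returns a hash of video details, or nil if a field is missing.
def parse_entry(xml)
  entry = REXML::Document.new(xml).root
  title     = entry.elements['title']
  author    = entry.elements['author/name']
  duration  = entry.elements['yt:duration']
  published = entry.elements['published']
  stats     = entry.elements['yt:statistics']
  return nil unless title && author && duration && published && stats

  {
    title:    title.text,                            # entities such as &amp; are decoded
    author:   author.text,
    duration: duration.attributes['seconds'].to_i,   # attribute value as an integer
    date:     published.text.split('T').first,       # keep only the date part
    views:    stats.attributes['viewCount'].to_i
  }
end

p parse_entry(SAMPLE_ENTRY)
```

The parser decodes entities and doesn't care about attribute order or quoting style, which is where the regex version is most likely to trip up. If the real feed nests these elements differently, the paths passed to `elements[]` would need adjusting.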