**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: gatotv.com
* @MinSWversion:
* @Revision 4 - [25/07/2016] IvanRF
*   - date replace fix
* @Revision 3 - [22/04/2016] 1NSdbZVbpZDX
*   - episode fix, added show details
* @Revision 2 - [05/01/2016] Francis De Paemeleere
*   - remove debug content
* @Revision 1 - [26/02/2015] Francis De Paemeleere
*   - remove new lines from title, ...
* @Revision 0 - [17/02/2015] Francis De Paemeleere
*   - creation
* @Remarks: currently index only (not much details on detail pages)
* @header_end
**------------------------------------------------------------------------------------------------

site {url=gatotv.com|timezone=America/Mexico_City|maxdays=7|cultureinfo=es-MX|charset=UTF-8|titlematchfactor=90|nopageoverlaps|firstshow=1}
urldate.format {datestring|yyyy-MM-dd}
url_index{url|http://www.gatotv.com/canal/|channel|/|urldate|}
url_index.headers {customheader=Accept-Encoding=gzip,deflate}     * to speedup the downloading of the index pages

index_showsplit.scrub {multi |tbl_EPG_row|>|]*>.*?.*?class="tbl_EPG_TimesColumn"[^>]*>(.*?)||}
index_start.modify {replace|a.m.|a. m.|am}
index_start.modify {replace|p.m.|p. m.|pm}
index_stop.modify {replace|a.m.|a. m.|am}
index_stop.modify {replace|p.m.|p. m.|pm}
index_title.scrub {regex||^.*?class="div_program_title_on_channel"[^>]*>(.*?)||}
index_subtitle.scrub {regex||^.*?class="div_episode_programa_on_channel"[^>]*>(.*?)||}
index_showicon.scrub {regex||^.*?src=\"([^\"]*)\"||}
index_urlchannellogo.scrub {regex||^.*?class=\"div_MainPicture\"[^>]*>\s*]*>[^/]*src=\"([^\"]*)\"||}

index_subtitle.modify {replace |Temporada |S}
index_subtitle.modify {replace |Episodio |Ep}
index_episode.scrub {single |div_episode_programa_on_channel">|||}
index_episode.modify {replace |Temporada |S}
index_episode.modify {replace |Episodio |Ep}

*index_temp_1.modify {substring(type=regex)|'index_subtitle' "\s*Temporada\s*(\d*)\s*"}
*index_temp_2.modify {substring(type=regex)|'index_subtitle' "\s*Episodio\s*(\d*)\s*"}
*index_subtitle.modify {remove(type=regex)|"\s*Temporada\s*\d*\s*"}
*index_subtitle.modify {remove(type=regex)|"\s*Episodio\s*\d*\s*"}
*index_subtitle.modify {remove(type=regex)|"<[^>]*>"}
*index_subtitle.modify {remove(type=regex)|"^\s*\!\?\!\?\!\s*$"}     *| is converted to !?!?! by WG++
*index_subtitle.modify {remove(type=regex)|"^\s*$"}
* index_temp_1 = season
* index_temp_2 = episode
*index_temp_1.modify {calculate(not="" format=F0)|1 -}
*index_temp_2.modify {calculate(not="" format=F0)|1 -}
*index_episode.modify {clear}
*index_episode.modify {addend('index_temp_1' not="")|'index_temp_1'}
*index_episode.modify {addend|.}
*index_episode.modify {addend('index_temp_2' not="")|'index_temp_2'}
*index_episode.modify {addend|.}
*index_episode.modify {clear(="..")}

index_start.modify {remove(type=regex)|<[^>]*>}
index_stop.modify {remove(type=regex)|<[^>]*>}
index_subtitle.modify {remove(type=regex)|<[^>]*>}
index_title.modify {remove(type=regex)|<[^>]*>}
index_subtitle.modify {remove(type=regex)|<[^>]*>}
index_subtitle.modify {cleanup}
index_title.modify {cleanup}
index_subtitle.modify {cleanup}

scope.range {(indexshowdetails)|end}
index_urlshow.headers {customheader=Accept-Encoding=gzip,deflate}     * to speedup the downloading of the detail pages
index_urlshow {url |||||}
description.scrub {single|"summary">|"description">||}
actor.scrub {multi |itemprop="actors"|itemprop="name">||||}
category.scrub {multi |Tipo:|nowrap" >||}
country.scrub {multi |Paises de Origen:|nowrap" >||}
producer.scrub {multi |itemprop="producer"|nowrap" >||}
composer.scrub {multi |Música:|nowrap" >||}
rating.scrub {single |Clasificación:|nowrap" >||}
writer.scrub {single |Basado en:|nowrap" >||}
titleoriginal.scrub {single |Título en Idioma Original:|nowrap" >||}
end_scope

** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
**
** @auto_xml_channel_start
*url_index{url|http://www.gatotv.com/guia_tv/completa}
*index_site_id.scrub {regex||class=\"\s*tbl_EPG_row[^>]*>(.*?)||}
*scope.range {(channellist)|end}
*index_site_channel.modify {addstart|'index_site_id'}
*index_site_id.modify {substring(type=regex)|]*.*?href=\"([^\"]*)\"[^>]*title=\"[^\"]*\"[^>]*>}
*index_site_id.modify {remove|http://www.gatotv.com/canal/}
*index_site_channel.modify {substring(type=regex)|]*.*?href=\"[^\"]*\"[^>]*title=\"([^\"]*)\"[^>]*>}
*index_site_id.modify {cleanup(removeduplicates=equal,100 link="index_site_channel")}
*end_scope
** @auto_xml_channel_end