**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing MDB data from TvGuide websites
* @MinSWversion : V1.1.1/56.12
* @Site: thetvdb.com
* @Revision 0 - [02/12/2015] Jan van Straaten
*   - creation
* @Remarks: - series database , primarysearch with bing
* @header_end
**------------------------------------------------------------------------------------------------
*
*
site {url=thetvdb.com|cultureinfo=en-US|charset=UTF-8|matchfactor=70|searchsite=bing}
* primary search:
mdb_variable_element.modify {addstart|'x_title'}
url_primarysearch {url|http://www.bing.com/search?q=thetvdb+####}
url_primarysearch.headers {customheader=Accept-Encoding=gzip,deflate}
*http://www.bing.com/search?q=thetvdb+Malcolm+in+the+Middle
url_primarysearch.modify {replace|####|'mdb_variable_element'}
url_primarysearch.modify {replace| |+} * if title or variable_element has spaces
* followed by all the known episodes in a element
* following the mdbconfig mustmatch="title,subtitle" the program looks for a match of the title (in the element)
* and a match of the subtitle in one of the elements
*
* mdb elements:
* in the top element:
mdb_title.scrub {single|p1||||}
mdb_title.scrub {single|p1||||}
mdb_actor.scrub {multi(separator="!?!?!")|p1||||}
mdb_actor.modify {cleanup}
*
* the subtitle is the episode title. The next scrub first results in all the subtitles for this series
* but the matching routine will automatically replace it by the one that matches (highest matchfactor)
mdb_subtitle.scrub {multi|p1||||}
*
* the next step is to extract the matching element
mdb_temp_6.scrub {multi|p1||||} * all episodes
mdb_temp_1.modify {calculate('mdb_temp_6' not "" type=element format=F0)|'mdb_temp_6' 'mdb_subtitle' @} * index of the episode
mdb_temp_1.modify {substring(type=element)|'mdb_temp_6' 'mdb_temp_1' 1} * the episode in xml format
*
* the other mdb elements
mdb_temp_1.modify {replace|>!?!?!|>}
mdb_temp_1.modify {replace|!?!?!<|<}
mdb_temp_1.modify {replace|!?!?!|, }
mdb_temp_1.modify {replace|\n|\|} * split in individual elements for easy separation
*
* episodeid
* we combine (=episode id) and
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_episode_id.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_episode_id.modify {remove|}
mdb_episode_id.modify {remove|}
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_temp_3.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_temp_3.modify {replace||SeasonId: }
mdb_temp_3.modify {remove|}
mdb_episode_id.modify {addend|'mdb_temp_3'}
mdb_episode_id.modify {cleanup}
*
* episode-num
* we combine and in SxEx
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_episode.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_episode.modify {replace||E}
mdb_episode.modify {remove|}
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_temp_3.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_temp_3.modify {replace||S}
mdb_temp_3.modify {remove|}
mdb_episode.modify {addstart|'mdb_temp_3'}
mdb_episode.modify {cleanup}
*
* description
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_description.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_description.modify {remove|}
mdb_description.modify {remove|}
mdb_description.modify {cleanup}
*
* starrating
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_starrating.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_starrating.modify {remove|}
mdb_starrating.modify {remove|}
mdb_starrating.modify {cleanup}
*
* Director
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_director.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_director.modify {remove|}
mdb_director.modify {remove|}
mdb_director.modify {cleanup}
*
* Guest stars
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_temp_4.modify {clear}
mdb_temp_4.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_temp_4.modify {remove|}
mdb_temp_4.modify {remove|}
mdb_temp_4.modify {replace|, |\|}
mdb_temp_4.modify {cleanup}
mdb_temp_4.modify {addend(not "")| (guest)}
mdb_temp_4.modify {replace|\||####}
mdb_actor.modify {replace|\||####}
mdb_actor.modify {addend|####'mdb_temp_4'}
mdb_actor.modify {replace|####|\|}
mdb_actor.modify {cleanup(removeduplicates=name,70)}
mdb_actor.modify {cleanup}
*
* Episode Icon (showicon)
mdb_temp_2.modify {calculate(type=element format=F0)|'mdb_temp_1' "" @} * index of element
mdb_showicon.modify {substring('mdb_temp_2' not "-1" type=element)|'mdb_temp_1' 'mdb_temp_2' 1}
mdb_showicon.modify {remove|}
mdb_showicon.modify {remove|}
mdb_showicon.modify {cleanup}
mdb_showicon.modify {addstart(not "")|http://www.thetvdb.com/banners/}
*
* Category = Genre
* not listed at each episode. Only once for the whole series at the top of p1
mdb_category.scrub {single|p1||||}
mdb_category.modify {replace|!?!?!|\|}