**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
* @MinSWversion: V2.1.5
* - (postprocess V2.0.6)
* @Site: imdb.com
* @Revision 2 - [23/09/2017] Jan van Straaten
* - fix of incomplete episodenumlist and improved mdb_episode
* - added overall starrating in case of unrated episode
* - added title on the home page
* @Revision 1 - [15/06/2016] Jan van Straaten
* - support for match on episode-num or sub-title, use of new scopes
* - added mdbinitype
* @Revision 0 - [13/02/2016] Jan van Straaten
* - Bing as primary search, based on imdb.com.imdb.ini rev 6
* @Remarks: Series data extraction. English version.
* Bing improves matching especially for foreign language titles.
* @header_end
**------------------------------------------------------------------------------------------------
*
site {url=imdb.com|mdbinitype=serie|cultureinfo=en-GB|charset=UTF-8|matchfactor=70|searchsite=imdb|episodesystem=onscreen}
scope.range {(primarysearch)|end}
* primary search (using bing):
url_primarysearch {url()|https://www.bing.com/search?q=IMDb+|'title'|}
*https://www.bing.com/search?q=IMDb+Unge+kommissarie+Morse
url_primarysearch.modify {replace| |+}
*http://www.imdb.com/search/title?title=Touched%20by%20an%20Angel&title_type=tv_series
url_primarysearch.headers {customheader=Accept-Encoding=gzip,deflate}
mdb_show_id.scrub {regex|primary||||}
* episode-num
mdb_episodenumlist.scrub {regex(pattern="'S1'.'E1'")|p1||
\s+?(.+?)\s? | ||}
* Unrated Episodes:
*
* 5.2 |
* Mannsteufel |
*
*
mdb_title.scrub {single(separator="(" include=first exclude="")|p1|} * the original title
mdb_title.scrub {single(separator="(" include=first)|p1||||}
mdb_title.scrub {single(separator="(" include=first)|p7|} * the title on the 'home' page
mdb_title.modify {cleanup(tags="/=\"")} * removes starting "
mdb_title.modify {cleanup(tags="\"=/")}
*
** aka's not yet implemented (if possible at all)
*mdb_title.scrub {multi(separator=" - ")|p3||\n| | |} *aka's
*mdb_title.scrub {multi|p3||
\n| | |} *aka's
*
mdb_temp_6.scrub {regex()|p1||
\s+?(||} * all the episodes *
end_scope
*
scope.range {(getelements)|end}
* in case of matched subtitle
mdb_temp_1.modify {calculate('mdb_episodetitlelist' not "" type=element format=F0)|'mdb_episodetitlelist' 'mdb_subtitle' @} * index of the episode
* in case of matched episodenum
mdb_temp_1.modify {calculate('mdb_episodenumlist' not "" type=element format=F0)|'mdb_episodenumlist' 'mdb_episode' @} * index of the episode
mdb_temp_1.modify {substring(type=element)|'mdb_temp_6' 'mdb_temp_1' 1} * the episode in xml format
*
* elements from mdb_temp_1 (the episode)
** get the mdb_episode_id
mdb_episode_id.modify {substring(type=regex)|'mdb_temp_1' "(\d\{7\})/\">"} * get the tt nbr for the episode
*
* the following elements are taken from the episode detail page mdb-p2
* there is a story line, director and actors, starrating, episodenum
* also a 'full synopsis' on a separate page mdb-p3 ??
*
** full productiondate
mdb_productiondate.scrub {single()|p2|}
mdb_productiondate.modify {calculate(format=productiondate)} * only year allowed!
mdb_category.scrub {regex()|p2||(.+?)||}
mdb_actor.scrub {multi()|p4|?ref_=ttfc_fc_cl_t|itemprop="name">||}
mdb_director.scrub {multi|p4|?ref_=ttfc_fc_dr|" >|| | }
mdb_starrating.scrub {single()|p2||itemprop="ratingValue">||
}
mdb_starratingvotes.scrub {single|p2||based on|user ratings|
}
mdb_temp_2.scrub {single()|p7|}
mdb_description.scrub {regex|p2||||}
*
* subtitle when not already done with episodetitlelist
mdb_subtitle.modify {substring("" type=regex)|'mdb_temp_1' "(.+?) | "}
* episode must be last because it is used to get mdb_temp_1 (the actual episode data from mdb_temp_6)
mdb_episode.modify {clear}
loop {('mdb_episode' "" max=1)|end}
mdb_episode.modify {substring(type=regex)|'mdb_temp_1' "(\d*?\|\d*?\.\d*?)\s? | "}
mdb_episode.modify {substring(type=element)|0 1}
mdb_temp_3.modify {substring(type=regex)|'mdb_episode' "\.(\d*)"} * episode part
mdb_episode.modify {substring(type=regex)|'mdb_episode' "(\d*)\."} * the season part
* onscreen format
mdb_episode.modify {addstart(not "")|S}
mdb_episode.modify {addend('mdb_temp_3' not "")|E'mdb_temp_3'}
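* worked example of the steps above, assuming (hypothetically, not taken from the site) that the first element scrubbed is 2.5:
*   mdb_temp_3 = 5 (episode part) and mdb_episode = 2 (season part)
*   addstart turns mdb_episode into S2, addend then gives S2E5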
* convert to xmltv_ns (disabled; xmltv_ns counts season and episode from zero, hence the subtraction of 1)
*mdb_temp_3.modify {calculate(not "" format=F0)|1 -}
*mdb_episode.modify {substring(type=regex)|'mdb_episode' "(\d*)\."} * the season part
*mdb_episode.modify {calculate(not "" format=F0)|1 -}
*mdb_episode.modify {addend()|.'mdb_temp_3'.}
end_loop
end_scope