Aljazeera Doc ini

msallal

Hi

I am trying to write an ini for Aljazeera Documentary (https://doc.aljazeera.net/%D8%AC%D8%AF%D9%88%D9%84-%D8%A7%D9%84%D8%A8%D8%AB), where the schedule is served by this call: https://doc.aljazeera.net/graphql?operationName=SchedulePageQuery&variab...

The problems are listed below:
1- The service is not JSON or XML; it returns mainly HTML as is.
2- The site supports Arabic only.
3- The date is in this format: الثلاثاء 31 أغسطس/آب 2021, which is where I need help.

Do you have any idea how I can match that day against today's date when parsing?

Thanks

BeChan

Try this one: it has 2 Arabic and 2 English EPGs; choose one after testing.

Attachments: 
mat8861

You need to use max=7.1 and firstday=0123456, then scrub the time and title.

msallal
BeChan wrote:

Try this one: it has 2 Arabic and 2 English EPGs; choose one after testing.

Those are not accurate; that's why I am trying to write my own.

msallal
mat8861 wrote:

You need to use max=7.1 and firstday=0123456, then scrub the time and title.

Unfortunately this is not working. The problem may be with parsing the Arabic date; I am not able to extract it correctly using regex.
Check my post here:
https://stackoverflow.com/questions/69232694/regex-extract-valid-arabic-...

mat8861

You don't need to set a date because it is all on one page. Use daycounter=0, because the guide starts with the first day (today).

BeChan

It is better to use Al Jazeera Documentary from bein entertainment.

mat8861
BeChan wrote:

It is better to use Al Jazeera Documentary from bein entertainment.

This guy is trying to get a siteini working... we are in the siteini developer forum.

@msallal
It is very easy, see attached.

msallal
mat8861 wrote:

You don't need to set a date because it is all on one page. Use daycounter=0, because the guide starts with the first day (today).

No, it does not. The page is not updated daily, so it does not always start with today; it is updated only 2-3 times a week.

BeChan

Al Jazeera Doc is a programme on Al Jazeera News Arabic: it is the documentary strand of Al Jazeera Investigations, which means it is part of the Al Jazeera news channel. You can check the home page; Al Jazeera Documentary is a different channel.
Mat8861, sorry to interfere, but I only want to help.

msallal

Aljazeera News is working fine because it has a JSON format, and I have already built my own ini for it. The Documentary one has only HTML, even from https://doc.aljazeera.net/graphql?operationName=SchedulePageQuery&variab...
It only returns HTML, which is rendered as is.

The problem with this site is that it is not updated daily, which means the first date may not be today, and the only way to match the day is to compare the date on the page with today. The date format is also mixed: السبت 18 سبتمبر/أيلول 2021

Attachments: 
mat8861

I did not monitor the updates, but if it is not updated daily, that is a problem. Maybe, if you really need it as it is, you can set it to run every Monday... Anyway, I think the explanation of how the site.ini works is clear to you.

msallal
mat8861 wrote:

I did not monitor the updates, but if it is not updated daily, that is a problem. Maybe, if you really need it as it is, you can set it to run every Monday... Anyway, I think the explanation of how the site.ini works is clear to you.

Yes, I understand the concept of the .ini, but for this specific site the data is updated once or twice per week, which means the first day is usually not today. That's why I need to parse the Arabic date string الثلاثاء 31 أغسطس/آب 2021 into dd-mm-yyyy, so that I can match the day with today.
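
Outside WebGrab+, the conversion asked for here can be sketched in a few lines of Python. This only illustrates the logic and is not siteini syntax: the names `parse_site_date` and `to_dd_mm_yyyy` are made up for the example, and the month table is an assumption that lists only the common Gregorian spellings (the part before the slash), so other spelling variants would have to be added.

```python
import re
from datetime import date

# Assumed spellings of the Gregorian month names that appear before the "/".
MONTHS = {
    "يناير": 1, "فبراير": 2, "مارس": 3, "أبريل": 4,
    "مايو": 5, "يونيو": 6, "يوليو": 7, "أغسطس": 8,
    "سبتمبر": 9, "أكتوبر": 10, "نوفمبر": 11, "ديسمبر": 12,
}

# Day number, first month name, optional "/second month name", four-digit year.
DATE_RE = re.compile(r"(\d{1,2})\s+([^\s/\d]+)(?:/\D+?)?\s+(\d{4})")


def parse_site_date(text: str) -> date:
    """Parse a heading such as 'الثلاثاء 31 أغسطس/آب 2021' into a date."""
    match = DATE_RE.search(text)
    if match is None:
        raise ValueError(f"no date found in {text!r}")
    day, month_name, year = match.groups()
    if month_name not in MONTHS:
        raise ValueError(f"unknown month name {month_name!r}")
    return date(int(year), MONTHS[month_name], int(day))


def to_dd_mm_yyyy(text: str) -> str:
    """Return the parsed date formatted as dd-mm-yyyy."""
    return parse_site_date(text).strftime("%d-%m-%Y")
```

The day name at the start is ignored on purpose; only the day number, the first month name, and the year are needed to compare the heading with today's date.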

mat8861

Just checked, and today it shows 19 Sep, so my siteini is OK (it's not firstday but daycounter=0).
I checked and:
1. You cannot get the graphql page.
2. To get the date, try urldate.format {datestring|ddd/dd/MMM/yyyy|ar-AE}

msallal
mat8861 wrote:

Just checked, and today it shows 19 Sep, so my siteini is OK (it's not firstday but daycounter=0).
I checked and:
1. You cannot get the graphql page.
2. To get the date, try urldate.format {datestring|ddd/dd/MMM/yyyy|ar-AE}

It works for you because the first day happened to be your current day. If you check the site now (https://doc.aljazeera.net/%D8%AC%D8%AF%D9%88%D9%84-%D8%A7%D9%84%D8%A8%D8%AB), the first day shown is Sunday, but today is Wednesday.

Also, urldate.format {datestring|ddd/dd/MMM/yyyy|ar-AE}
did not change the result, because the date here is not in a standard format: الإثنين 20 سبتمبر/أيلول 2021

This is equivalent to ddd dd mon/mon (the month written a second time under another name) yyyy.
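
One way to get from that mixed form to a plain "ddd dd MMM yyyy" string is to drop the second month name before matching. The following Python sketch shows the regex idea only; it is not WebGrab+ syntax, and the function name `drop_second_month` is made up for the example. It assumes the second name always follows a slash and is followed by a four-digit year.

```python
import re

# "/" plus the second month name (which may contain a space),
# up to the whitespace that precedes the four-digit year.
SECOND_MONTH_RE = re.compile(r"/\D+?(?=\s+\d{4})")


def drop_second_month(text: str) -> str:
    """Turn 'ddd dd mon/mon yyyy' into 'ddd dd mon yyyy'."""
    return SECOND_MONTH_RE.sub("", text)
```

A heading without a slash passes through unchanged, so the function can be applied to every date heading on the page.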

mat8861

Do not waste time; this site will never be OK. My suggestion is that every week you make a fixed siteini, which will be very easy as you have the time and title.

msallal
mat8861 wrote:

Do not waste time; this site will never be OK. My suggestion is that every week you make a fixed siteini, which will be very easy as you have the time and title.

I don't understand why. The data is correct and accurate; the only problem with the site is that it changes the data two or three times a week.

The only thing we need is to match the start date against the date provided on the site, in the same format. I am trying to use regex to extract a suitable date that can be easily grabbed.

I appreciate your advice.

mat8861

I have to find a solution for the dates... I can get one day correctly; I have to work out a formula to include more than one day.

msallal

One or two days is fine for me, because I have a Docker job that runs every day.
I would appreciate it if you could help with the right calculation here.

mat8861

Here you go. If you run it every day after midnight, it will be OK. I will think about how to get whatever is available.

Attachments: 
msallal
mat8861 wrote:

Here you go. If you run it every day after midnight, it will be OK. I will think about how to get whatever is available.

Thanks Mat. This works for only one day, and I have to grab multiple times to get the entire day. Can we make it work for two days at least?

For example, can we calculate for today and tomorrow:
scope.range{(urlindex)|end}
index_variable_element.modify {set|1.0}
index_variable_element.modify {calculate(format=timespan,days)}
index_temp_1.modify {calculate(format=date,ddd#dd#MMM)|'urldate'}
index_temp_1.modify{replace|#| }
index_temp_2.modify {calculate(format=date,ddd#dd#MMM)|'urldate' 'index_variable_element' +}
index_temp_2.modify{replace|#| }
url_index.modify {replace|##start_date##|'index_temp_1'}
url_index.modify {replace|##end_date##|'index_temp_2'}
end_scope

Thanks

mat8861

You cannot... that may work in url_index but not with split_index. I will think about a solution.

msallal

Hi Matt,
Did you find a way to handle this?

Thanks

mat8861

Sorry, I thought I had posted it. So it starts where it finds today's date and continues to the end.

Attachments: 
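
The "start at today's date and read to the end" idea can be sketched outside WebGrab+ as follows. This is a Python illustration of the selection logic only, not the attached siteini: the function name `from_today` is made up, and it assumes the date headings on the page have already been parsed into dates.

```python
from datetime import date


def from_today(days: list[tuple[date, list[str]]], today: date) -> list[tuple[date, list[str]]]:
    """Keep the schedule from the first day dated today or later.

    `days` holds (date, shows) pairs in page order. Days before `today`
    are skipped, which covers a page that has not been refreshed yet.
    """
    for position, (day, _shows) in enumerate(days):
        if day >= today:
            return days[position:]
    # Every day on the page is already in the past.
    return []
```

If the page starts on Sunday and the grab runs on Wednesday, the Sunday to Tuesday entries are dropped and the result begins with Wednesday.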
msallal
mat8861 wrote:

Sorry, I thought I had posted it. So it starts where it finds today's date and continues to the end.

You are amazing, much appreciated.
I will test it for the entire week and let you know.
Thanks, bro.

msallal

Hi Matt,

Just out of curiosity: there is one issue with the .ini.
The date on the web page usually looks like this: الأحد 7 نوفمبر, while the urldate format looks for الأحد 07 نوفمبر, with a leading zero on the day number, so for the first 9 days of a month it will not match.

I ended up changing
index_variable_element.modify{calculate(format=date,ddd#dd#MMM)|'urldate'}
to
index_variable_element.modify{calculate(format=date,ddd#d)|'urldate'}
so that it looks only for the day name and the day number without a leading zero, like this: الأحد 7

It is working fine, but out of curiosity, how can I remove the leading zero from the urldate?
I checked the documentation but could not figure out how to remove the leading zero. Do you have any idea?

Thanks
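
For comparison, this is how the same label could be built without a leading zero in Python. It is only an illustration of the target string, not an answer in siteini syntax: the names `site_label` and `strip_leading_zero` are made up, and the weekday and month tables are assumptions (only some of these names appear in this thread).

```python
import re
from datetime import date

# date.weekday() numbers Monday as 0. Assumed Arabic names, Monday to Sunday.
WEEKDAYS = ["الإثنين", "الثلاثاء", "الأربعاء", "الخميس", "الجمعة", "السبت", "الأحد"]

# Assumed Gregorian month spellings, January to December.
MONTHS = [
    "يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو",
    "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر",
]


def site_label(day: date) -> str:
    """Build 'day-name day-number month' with no leading zero on the day."""
    return f"{WEEKDAYS[day.weekday()]} {day.day} {MONTHS[day.month - 1]}"


def strip_leading_zero(label: str) -> str:
    """Turn a standalone two-digit '07' into '7' in an already formatted label."""
    return re.sub(r"\b0(\d)\b", r"\1", label)
```

Either route gives a string that matches the page for the first 9 days of a month: build it from the date with `day.day`, or clean up a label that was formatted with a two-digit day.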

mat8861

Sent you a PM.

msallal

Replied.

mat8861

@msallal, once you have tested it, please post it for the community.
Thanks

msallal

Sure.

I have another question: why do some inis show a summary in the logs while others do not?
[ Info ] Summary for update of channel name
[ Info ] missing shows added 0
[ Info ] changed shows updated 0
[ Info ] new shows added 72
[ Info ] unchanged shows inspected 0
[ Info ] total after update 72

I don't know what I am missing to get those lines into the logs.

Thanks

mat8861

The solution for doc aljazeera was even simpler, see attached.
As for the logs: if debug comes out as the summary, something is not properly set somewhere. Lots of things could cause that; it could even be a space, or a date in the subtitle, for example. Basically it depends... also on the mode in the config.
In your sample, with force mode in the config:
[ Info ] Summary for update of الجزيرة الوثائقية
[ Info ] missing shows added 0
[ Info ] changed shows updated 0
[ Info ] new shows added 126
[ Info ] unchanged shows inspected 0
[ Info ] total after update 126

[ Info ] elapstime / updated show 0.00 seconds
[ Debug ]
[ Debug ] 126 shows in 1 channels
[ Debug ] 0 updated shows
[ Debug ] 126 new shows added
[ Info ]
[ Info ]
[ ] Job finished at 09/11/2021 19:00:32 done in 1s

With incremental mode:
[ Info ] Summary for update of الجزيرة الوثائقية
[ Info ] no changes, no update necessary !
[ Info ] unchanged shows inspected 126
[ Info ] total after update 126

[ Debug ]
[ Debug ] 126 shows in 1 channels
[ Debug ] 0 updated shows
[ Debug ] 0 new shows added
[ Info ]
[ Info ]
[ ] Job finished at 09/11/2021 18:58:58 done in 1s

Attachments: 

Brought to you by Jan van Straaten

Program Development - Jan van Straaten ------- Web design - Francis De Paemeleere
Supported by: servercare.nl