|
Post by Rod on Feb 22, 2020 8:00:54 GMT -5
If I run the simple extract from your code I get 53918 bytes of html without error which appears to be complete.
a$ = httpGet$("http://rosettacode.org/wiki/Category:Liberty_BASIC") ' get RB tasks from [RC] print len(a$) print a$
It starts with "<!DOCTYPE html>" It ends with "</html>"
Not quite sure how the rest of the content is structured but it seems to list all the task, not just those we have completed.
|
|
|
Post by Rod on Feb 22, 2020 8:07:25 GMT -5
Ahh... you both mention time to retrieve page. I am on Fibre and the download takes less than a second.
|
|
|
Post by Rod on Feb 22, 2020 8:58:07 GMT -5
Ok, the first line chokes with what looks like a 400 page not found error. The second gets the data but it is everyone's code. We would need to parse out Liberty which is probably possible.
I assume we are going to provide a short "list" clicking on that list will grab the code with a second httpget$() call
Either line works in my browser, the first get the Liberty code in view but all other examples are there too.
a$=httpget$("http://rosettacode.org/wiki/15_Puzzle_Game#Liberty_BASIC") print len(a$) print a$ a$=httpget$("http://rosettacode.org/wiki/15_Puzzle_Game") print len(a$) print a$
|
|
|
Post by metro on Feb 22, 2020 9:53:20 GMT -5
Ok, the first line chokes with what looks like a 400 page not found error. The second gets the data but it is everyone's code. We would need to parse out Liberty which is probably possible. I assume we are going to provide a short "list" clicking on that list will grab the code with a second httpget$() call Either line works in my browser, the first get the Liberty code in view but all other examples are there too. Rod, I have managed to parse out just the Liberty code with my last post. what I am struggling to understand is... yes I can post the url's you have supplied into my browser and it will go to the correct page.
BUT... when running the code below urlfile$ is downloaded to RosettaPage2.txt and a$ shows <title>400 Bad Request</title> on my machine (Mint Linux Wine) so in summary httpget$ fails but DownloadToFile$ actually gets the correct info. Hope that makes sense.(same link)
In order to parse out the Liberty specific code, the correct "section" needs to be selected from the downloaded page.
httpget$ works for all other requests just not for "http://rosettacode.org/wiki/15_Puzzle_Game#Liberty_BASIC" or any other code example
urlfile$= "http://rosettacode.org/wiki/15_Puzzle_Game#Liberty_BASIC" a$=httpget$(urlfile$) print a$
localfile$="RosettaPage2.txt" result = DownloadToFile(urlfile$, localfile$)
end
Function DownloadToFile(urlfile$, localfile$) print urlfile$ open "URLmon" for dll as #url calldll #url, "URLDownloadToFileA",_ 0 as long,_ 'null urlfile$ as ptr,_ 'url to download localfile$ as ptr,_ 'save file name 0 as long,_ 'reserved, must be 0 0 as long,_ 'callback address, can be 0 DownloadToFile as ulong '0=success close #url End Function
|
|
|
Post by Rod on Feb 22, 2020 11:17:23 GMT -5
Well after 30 seconds of google it appears we should not be sending #Liberty_Basic because the server does not expect it. Its for local use. The fetching software uses it to scroll the page once the entire page is received.
So we will need to use httpGet$() differently and get the root page and then extract the Liberty code. urlmon must be following protocol and not actually sending the #Liberty_Basic it is using that fragment identifier to scroll the local copy of the response?
Easy to mimic?
|
|
|
Post by gidiom2 on Feb 22, 2020 12:01:23 GMT -5
urlfile$= "http://rosettacode.org/wiki/15_Puzzle_Game?=Liberty_BASIC" might do it. Not sure if the info is complete, don't have time to check the returned html today.
|
|
|
Post by Rod on Feb 22, 2020 14:18:42 GMT -5
No, that's just getting a dummy response, same as changing # to : it appears to work but we are not getting the data. We need to stop adding the fragment description.
|
|
|
Post by metro on Feb 22, 2020 14:49:01 GMT -5
I'm a bit slow on the uptake sometimes Thanks Rod
|
|
|
Post by meerkat on Feb 22, 2020 17:52:13 GMT -5
Hi Laurie! You should move to California. I thought we had fires and then rain to wash out all the roads. But you guys, made us look like amateurs. Hope you still have your house..
I had a program in RB that cross referenced all the BASIC type languages to the tasks. Not sure anyone is interested, but decided to post it since it may be of some help.
I did a little conversion and it runs is LB. The RB code output to HTML and you could select any language from any task and it would show you the code. Looked like a lot of work to add that selection so I took it out. I can put it back if anyone is interested. It only takes a couple htmlGet's to get the code..
Nothing fancy.. Just a list..
a$ = httpGet$("http://rosettacode.org/mw/index.php?title=Special%3ASearch&search=basic&fulltext=Search") a$ = word$(a$,2,"Page title matches") 'print a$ print "==================== Basic Languages ==============================" numLang = 0 numLang = 0
s$ = "<li><div class='mw-search-result-heading'><a href=";chr$(34);"/wiki/" ls = len(s$) i = 3 i = 3 while i > 0 i = instr(a$,s$,i+5) + ls j = instr(a$,chr$(34),i+1) a3$ = mid$(a$,i,j-i) if a3$ = "Exceptions" then exit while if instr(a3$,"****") > 0 then goto [skipIt] if left$(a3$,5) = "BASIC" then goto [skipIt] if instr(a3$,"/") > 0 then goto [skipIt] if instr(a3$,"(") > 0 then goto [skipIt] if instr(a3$,"_string") > 0 then goto [skipIt] numLang = numLang + 1 lang$ = lang$ + a3$ + "|" print numLang,a3$ [skipIt] wend
' -------------------------------- ' find all task ' -------------------------------- print "============= Tasks ================" a$ = httpGet$("http://rosettacode.org/wiki/Category:Programming_Tasks") i = instr(a$,"<h2>Pages in category ";chr$(34);"Programming Tasks";chr$(34);"</h2>") a$ = mid$(a$,i+40) i = instr(a$,"The following") x$ = mid$(a$,i+14,20) x$ = left$(x$,6) i = instr(x$,",") if i > 0 then x$ = left$(x$,i-1) + mid$(x$,i+1) numTask = val(x$)
dim tasks$(numTask,numLang)
print "============= find task for each language ================" i = 0 srch$ = "<li><a href=";chr$(34);"/wiki/" ln = len(srch$) for nt = 1 to numTask i = instr(a$,srch$,i+1) if i = 0 then exit for x$ = mid$(a$,i+ln,100) thisTask$ = word$(x$,1,chr$(34)) ' print "thisTask:";thisTask$ tasks$(nt,0) = thisTask$ next nt
' ----------------------------------------- ' Find all task each language does ' ---------------------------------------- for i = 1 to numLang thisLang$ = word$(lang$,i,"|") print "Finding Task for language:";thisLang$ a$ = httpGet$("http://rosettacode.org/wiki/Category:";thisLang$) for j = 1 to numTask if instr(a$,tasks$(j,0)) > 0 then tasks$(j,i) = "X" else tasks$(j,i) = "-" end if next j next i for i = 1 to numTask thisTask$ = tasks$(i,0)+" -------------------------------------------------" print using("####",i);" ";left$(thisTask$,40); for j = 1 to numLang print " ";tasks$(i,j); next j print next i end
Hope this helps!! Dan
|
|
|
Post by metro on Feb 22, 2020 19:18:12 GMT -5
G'Day Dan, translates (Howdy)
We have been pretty lucky our end (2,500 miles from the fires) but have had a couple of small fires that closed our main freeway a few times. I'm very sorry that you lost three of your countrymen over here, whilst they battled the fires. I have just found out you had to outrun fires your end when you got evacuated, glad to see you're still with us.
Thanks for sharing the code.
Laurie Y'all have a great day
|
|
|
Post by gidiom2 on Feb 23, 2020 5:55:56 GMT -5
No, that's just getting a dummy response, same as changing # to : it appears to work but we are not getting the data. We need to stop adding the fragment description. The Puzzle page being used as an example does actually contain the LB code for this category (along with all other languages) but would be a pain to parse. As far as I can see this is the only place the code resides and httpGet$ only seems to get a complete file so parsing seems to be imperative. Save the downloaded page as html and open it in a browser to see the codes in readable form. I don't see any method other than parsing each category file to get the code. Someone may have a eureka moment.
|
|
|
Post by metro on Feb 23, 2020 8:39:33 GMT -5
No eureka moments down here just accepting the pain in the parse I have had some success with Sqlite but as my experience is limited I'm back for more sage advice. How do I insert a multiple lines into a field (ie new lines with CRLF or just LF) looking to insert the downloaded code
Nothing is working but is also not throwing an error...could be me though! Thanks in advance EDIT FIXED ERROR with a2$
EDIT 2 seems Sqlite doesn't like chr$(10) so to save downloaded code I have replaced chr$(10) with chr$(11) which I will rectify when extracting data. to quote a famous American..........."I'm not a smart man"....so if there is a better way...."Please Explain" ( to quote a not so famous Australian) I also double quoted the double quotes not sure yet if that is necessary...testing You will have to save the code below to a folder that includes the Dll
Dquote$=chr$(34)+chr$(34) DIM LineItem$(1000): dim d$(2) : dim row$(20) ' global DB$,hDB , row$ ,SQL$,urlfile$,Dquote$ GOSUB [setStruct]
CALL OPENDLL DB$= "Rosetta.db3"
dim info$(10, 10) if fileExists(DefaultDir$ , DB$) then CALL CONNECTDB DB$ else hDB=createDB(DB$) CALL CONNECTDB DB$ GOSUB [CREATETABLE] end if
a2$="http://rosettacode.org/mw/index.php?title=100_doors&action=edit§ion=187" Code2Save$= GETCODE$(a2$) print Code2Save$ Code2Save$= strRep$(Code2Save$,chr$(34),Dquote$) Code2Save$= strRep$(Code2Save$,chr$(10),Chr$(11))
Lang$="Liberty Basic"
CALL INSERTRECORDS Lang$,a2$,Code2Save$ CALL DISPLAYRESULT
close #sq3 print
print "***** Finished ***********" WAIT
[CREATETABLE] SQL$ = "CREATE TABLE Snippets(Lang,Title,Code Blob)" CALL EXECUTESQL SQL$ RETURN
SUB EXECUTESQL SQL$ calldll #sq3, "SQ3_4_LB_Execute",SQL$ as ptr, hDB as long,result as long IF result = 0 THEN calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long 'print Winstring(result) END IF END SUB
[setStruct] Struct RS,_ BOF as long,_ ' is True when CurrPos = 1 EOF as long,_ ' is True when CurrPos = Rows Handle as long,_ ' address of recordset data returned by sqlite3.dll (dont't use) Rows as long,_ ' number of rows in recordset Cols as long,_ ' number of columns in recordset CurrPos as long,_ ' current row in recordset StrAdr as long ' address of data item (pointer to a string) RETURN
SUB OPENDLL Open "SQ3_4_LB.dll" for DLL As #sq3 calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long msg$ = Winstring(result) ' LastMessage would be "General Error - Couldn't open 'sqlite3.dll'" if instr(msg$,"Error") then notice msg$ : close #sq3 ' close sqlite and end program end end if END SUB
SUB CONNECTDB DB$ calldll #sq3, "SQ3_4_LB_OpenDB",DB$ as ptr, overwrite as long, hDB as long ' see if it exist If hDB = 0 then calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long connectDB$ = Winstring(result) notice connectDB$ end if END SUB
SUB INSERTRECORDS Lang$,a2$,Code2Save$ SQL$ = "INSERT INTO Snippets VALUES ('";Lang$;"','";a2$;"','";Code2Save$;"')" print SQL$ calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long msg$ = Winstring(result) IF instr(msg$,"Error") THEN print msg$ CALL EXECUTESQL SQL$ END SUB
SUB DISPLAYRESULT SQL$ = "SELECT * FROM Snippets" calldll #sq3, "SQ3_4_LB_GetRecordset", SQL$ as ptr,hDB as long, RS as struct, result as long calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long msg$ = Winstring(result)
IF instr(msg$,"Error") THEN notice msg$
IF RS.Rows.struct THEN PRINT str$(RS.Rows.struct); " Records Found and sorted by a5 : ":print
FOR n = 1 TO RS.Rows.struct CALLDLL #sq3, "SQ3_4_LB_RecordsetMoveToRow",n AS long, result AS void CALLDLL #sq3, "SQ3_4_LB_GetRecordsetValueOfRow", "|" AS ptr,result AS long IF result THEN RecFound$(n)= Winstring(RS.StrAdr.struct) print RecFound$(n) end if
CALLDLL #sq3, "SQ3_4_LB_RecordsetMoveNext", result AS void NEXT n END IF CALLDLL #sq3, "SQ3_4_LB_ReleaseRecordset", result AS void END SUB
FUNCTION GETCODE$(a2$) Url2get$=a2$ print Url2get$ result$=httpget$(Url2get$) result$=after$(result$,"lang lb>") result$=upto$(result$,"</lang>") GETCODE$=result$ END FUNCTION
' string replace rep str with FUNCTION strRep$(str$,rep$,with$) ln = len(rep$) ln1 = ln - 1 i = 1 while i <= len(str$) if mid$(str$,i,ln) = rep$ then strRep$ = strRep$ + with$ i = i + ln1 else strRep$ = strRep$ + mid$(str$,i,1) end if i = i + 1 WEND END FUNCTION
Function DownloadToFile(urlfile$, localfile$) print urlfile$ open "URLmon" for dll as #url calldll #url, "URLDownloadToFileA",_ 0 as long,_ 'null urlfile$ as ptr,_ 'url to download localfile$ as ptr,_ 'save file name 0 as long,_ 'reserved, must be 0 0 as long,_ 'callback address, can be 0 DownloadToFile as ulong '0=success close #url End Function
function fileExists(path$, filename$) 'dimension the array info$( at the beginning of your program files path$, filename$, info$() fileExists = val(info$(0, 0)) 'non zero is true end function
FUNCTION createDB(DB$) calldll #sq3, "SQ3_4_LB_CreateDB",DB$ as ptr, overwrite as long, hDB as ulong calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long 'message for testing If hDB = 0 then calldll #sq3, "SQ3_4_LB_GetLastMessage", result as long end if createDB=hDB END FUNCTION
|
|
|
Post by Rod on Feb 23, 2020 9:02:40 GMT -5
gidiom2 Ok, but I am still thinking that these fragments on the tail end of the url are really not meant to be sent. My browser handles it well. It probably does not send the #liberty_basic but uses that once it has the whole page response. So with httpget$() we can send extra characters on the tail end of the url, but there is a risk that will cause the server to complain, other times it just seems to ignore the extra characters.
|
|
|
Post by gidiom2 on Feb 23, 2020 12:05:36 GMT -5
Rod, I forgot about the query of the url whilst concentrating on how to get the code. The tail end of the url is probably ineffective, I'll look at it again later but it remains that parsing each category is the only apparent way. (as metro said just accepting the pain in the parse )
|
|
|
Post by metro on Feb 23, 2020 23:50:46 GMT -5
Needs a lot more polishing and ultimately will should have database storage. and access to all languages and a better GUI
'Form created with the help of Freeform 3 v07-15-08 'Generated on Feb 24, 2020 at 10:57:56 global urlfile$,path$,filename$, TaskList$,Wlang$,alltext$
CR$= chr$(13): LF$=chr$(10): EOL$= CR$+LF$ ' make them global global CR$,LF$,EOL$ TaskList$="tasklist.txt" [setup.main.Window]
'-----Begin code for #main Dim allTasks$(2000): Dim allLangs$(1000) nomainwin WindowWidth = 700 WindowHeight = 500 UpperLeftX=int((DisplayWidth-WindowWidth)/2) UpperLeftY=int((DisplayHeight-WindowHeight)/2)
'-----Begin GUI objects code
TexteditorColor$ = "white" texteditor #main.textedit4, 160, 180, 515, 200 ListboxColor$ = "white"
listbox #main.listbox1, allTasks$(), [listbox1DoubleClick], 350, 40, 320, 100 listbox #main.combo1 allLangs$(), listbox2DoubleClick, 155, 40, 180, 100 ' Combobox #main.combo1 allLangs$(), [listbox2DoubleClick], 155, 40, 180, 100 ' button #main.button2,"Update tasks",[button2Click], UL, 5, 65, 120, 25 button #main.button3,"Update Languages",GETLANGS, UL, 5, 40, 120, 25 statictext #main.statictext1, "Language List", 200, 20, 180, 20 statictext #main.statictext2, "Task List", 455, 20, 200, 20 statictext #main.statictext3, "Download Language List First", 155, 150, 180, 20 '-----End GUI objects code
menu #main, "Edit" ' <-- Texteditor menu.
open "Liberty Basic Rosetta Tasks" for window as #main print #main, "font ms_sans_serif 10" print #main, "trapclose [quit.main]"
[main.inputLoop] 'wait here for input event 'CALL GETLANGS handle$
WAIT
SUB MyPopup Wtask$
WindowWidth = 380 WindowHeight = 110 UpperLeftX=int((DisplayWidth-WindowWidth)/2) UpperLeftY=int((DisplayHeight-WindowHeight)/2) Statictext #dlg2.txt, "Please wait while "+ Wtask$ +" is downloaded", 10, 30, 370, 20 '*********POP UP TO WARN OF DOWNLOAD ' Button #dlg.default, " Okay ", [OkayButton], UL, 140, 100 Open "DOWNLOADING" for Dialog_popup as #dlg2 end sub
[listbox1DoubleClick] 'Perform action for the listbox named 'listbox1' editSection$ ="" #main.textedit4 ,"!cls" ; #main.listbox1 , "selection? Wtask$" cursor hourglass CALL MyPopup Wtask$
urlfile$= "http://rosettacode.org/wiki/" + Wtask$ ' + "Liberty_BASIC" alltext$=httpget$(urlfile$) editSection$= FindEditSection$(alltext$) Code2Save$= GETCODE$(Wtask$ , editSection$) if Code2Save$="" then NOTICE "NO Code Found" #main.textedit4, Code2Save$ print "task 2 get :"; Wtask$;" ";editSection$ close #dlg2 cursor normal wait SUB listbox2DoubleClick handle$ cursor hourglass CALL MyPopup " AN UPDATED LIST" dim allTasks$(2000) #main.listbox1, "reload" #main.textedit4 ,"!cls" ; #main.combo1 , "selection? Wlang$" print Wlang$; " Wlang " a$ = httpGet$("http://rosettacode.org/wiki/Category:";Wlang$) a1$ = word$(a$,2,"Pages in category ";chr$(34);Wlang$) a1$ = upto$(a1$,"class=";chr$(34);"printfooter") ' a1$ = word$(a1$,1,"</tr></table>") w = 2 a2$ = word$(a1$,w,"<li><a href=";chr$(34);"/wiki/") ' while a2$ <> "" a2$ = left$(a2$,instr(a2$,chr$(34))-1) a2$ = strRep$(a2$,"%2B","+") a2$ = strRep$(a2$,"%27","'") ' print using("####",w-1);" ";a2$ allTasks$(w-1)=a2$ urlfile$= "http://rosettacode.org/wiki/" + a2$ '+ "#Liberty_BASIC"
w=w + 1 a2$ = word$(a1$,w,"<li><a href=";chr$(34);"/wiki/") wend #main.listbox1, "reload" open TaskList$ for output as #T for w=1 to w-2 #T,allTasks$(w) next close #T close #dlg2 #main.statictext3,"there are ";w-1;" Tasks" cursor normal END SUB wait
WAIT
[button2Click] 'Perform action for the button named 'button2' CALL MyPopup " AN UPDATED LIST" a$ = httpGet$("http://rosettacode.org/wiki/Category:Liberty_BASIC") ' get RB tasks from [RC] a1$ = word$(a$,2,"Pages in category ";chr$(34);Wlang$) a1$ = word$(a1$,1,"</tr></table>") i = 2 a2$ = word$(a1$,i,"<li><a href=";chr$(34);"/wiki/") ' ' Create a drop down window for selection of a task ' print "------------ LB List -----------------" while a2$ <> "" a2$ = left$(a2$,instr(a2$,chr$(34))-1) a2$ = strRep$(a2$,"%2B","+") a2$ = strRep$(a2$,"%27","'") ' print using("####",i-1);" ";a2$
allTasks$(i-1)=a2$
i = i + 1 a2$ = word$(a1$,i,"<li><a href=";chr$(34);"/wiki/") wend #main.listbox1, "reload" open TaskList$ for output as #T for i=1 to i-1 #T,allTasks$(i) next close #T close #dlg2 wait
[quit.main] 'End the program close #main end
' string replace rep str with FUNCTION strRep$(str$,rep$,with$) ln = len(rep$) ln1 = ln - 1 i = 1 while i <= len(str$) if mid$(str$,i,ln) = rep$ then strRep$ = strRep$ + with$ i = i + ln1 else strRep$ = strRep$ + mid$(str$,i,1) end if i = i + 1 WEND END FUNCTION
'Section number needed to isolate language specific code FUNCTION FindEditSection$(alltext$)
alltext$=after$(alltext$,Wlang$;"</a></span>") alltext$=after$(alltext$,"section=") alltext$=upto$(alltext$,"Edit section:") alltext$=upto$(alltext$,chr$(34)) FindEditSection$ = alltext$ END FUNCTION
FUNCTION GETCODE$(Wtask$,editSection$) ' checkit$="http://rosettacode.org/mw/index.php?title=100_doors&action=edit§ion=187" ' print checkit$; " CHECK" Url2get$="http://rosettacode.org/mw/index.php?title=";Wtask$;"&action=edit§ion=";editSection$ print " URL "; Url2get$
result$=httpget$(Url2get$) result$=after$(result$,"<lang ") result$=upto$(result$,"</lang>") result$=after$(result$,">")
GETCODE$= REPLSTR$(result$,"<","<") ' result$ END FUNCTION
SUB GETLANGS handle$ cursor hourglass CALL MyPopup " Languages are / " Bgn$="<h2>Subcategories</h2>" Lst$="</div></div></div><div class=";chr$(34);"printfooter" ' </div></div></div><div class="printfooter">
result$=httpget$("http://rosettacode.org/wiki/Category:Programming_Languages")
result$=after$(result$,Bgn$) result$=upto$(result$,Lst$) result$=replstr$(result$,"<ul>","") result$ = strRep$(result$,LF$,EOL$)
open "Languages.txt" for output as #lang #lang, result$ close #lang
open "Languages.txt" for input as #lang z=1 while eof(#lang)=0 line input #lang,Lang$ y1 = instr(Lang$,"<li><a href=") if y1 then Lang$= ReplStr$(Lang$,"<li><a href=";chr$(34);"/wiki/Category:","") Lang$=after$(Lang$,"Category:") Lang$=upto$(Lang$,chr$(34)) allLangs$(z) = Lang$
z=z+1 end if wend close #lang close #dlg2 #main.combo1 "reload" #main.combo1, "select ";"Liberty BASIC" cursor normal CALL listbox2DoubleClick handle$ ' call LoadTasks END SUB
SUB LoadTasks dim allTasks(2000) t=1 open TaskList$ for input as #T while not(eof(#T)) line input #T, allTasks$(t) ' print allTasks$(i) t=t+1 wend close #T #main.statictext3,"there are ";t-1;" Tasks" #main.listbox1, "reload" END SUB
[errorHandler] NOTICE "Error string is " + chr$(34) + Err$ + chr$(34);" Num ";err
end '
|
|