dkl
Full Member
Posts: 234
|
Post by dkl on Aug 15, 2021 19:09:05 GMT -5
Sorry Gentlemen,
I have asked this question before and searched all my posts, but cannot find the relevant one. I want to find each position of "name" in the string below and, using that position, extract the next bit of info, as seen below.
I used the following code to get the info below:
pos = instr(inf$,"actor"):if pos <> 0 then actor1$ = after$(inf$,"[{"):actor2$ = upto$(actor1$,"}]")
(NOTE: I am aware that after$ and upto$ can be combined, but I always get the syntax wrong, hence what you see there!)
This makes the array$ - actor2$:
"@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},{"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"
Now I want to extract the names from actor2$. The code below will do the trick, but it only finds the first instance.
pos = instr(actor2$,chr$(34);"name";chr$(34);":";chr$(34))
if pos <> 0 then actor3$ = after$(actor2$,chr$(34);"name";chr$(34);":";chr$(34)):actor4$ = upto$(actor3$,chr$(34);"}")
I have tried a for/next loop and a do/loop and I'm aware that I have to 'move pos on' in order to continue the search.
In a past example I saw pos = pos + 1, but that doesn't seem to work here.
Some help would be appreciated please.
On another note, is there a simple 'rule of thumb' when adding chr$(34) to array$ with multiple quotation symbols?
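On the chr$(34) question, one habit that may help (a sketch, not an official rule) is to store the quote character in a variable once and then write one q$ for every literal quote that appears in the target text. The names q$ and key$ are made up for this illustration.

```basic
'sketch: q$ and key$ are illustrative names, not part of the original code
q$ = chr$(34)                        'one double-quote character
key$ = q$ + "name" + q$ + ":" + q$   'the three quotes in "name":" become three q$
print key$                           'shows the search key exactly as it will be matched
pos = instr(q$ + "name" + q$ + ":" + q$ + "Naomi Watts" + q$, key$)
print pos                            'non-zero when the key is found
```

Reading the target text left to right and typing q$ each time a quote appears keeps the count of quotes honest.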
|
|
|
Post by Walt Decker on Aug 15, 2021 19:51:07 GMT -5
I = 1
[DO.LOOP]
I = INSTR(I, Source$, Match$)
IF I = 0 THEN GOTO [END.LOOP]
IF I THEN
    DoSomething
    I = I + LEN(Match$)
END IF
GOTO [DO.LOOP]
[END.LOOP]
|
|
|
Post by dkl on Aug 16, 2021 0:44:48 GMT -5
Thanks Walt, that's what I meant (the 'I' variable should be at the end, not the beginning), but not to worry, I realised that! I got the code to print out the position of each match. However, I'm still flummoxed: once I find each position with INSTR and store some info from the array$, how do I store the info for the rest of the array$? I keep getting the same piece of info saved, not the rest. I want to find "name" and then store the name immediately after it, but all I get is the first name in the array$, over and over!
|
|
|
Post by Walt Decker on Aug 16, 2021 8:57:39 GMT -5
How about posting a couple of actual data sets so they can be analyzed. I am certain that I, or someone with more knowledge than I, can come up with something.
Ok, I see what you are getting at.
this makes the array$ - actor2$ "@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},{"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"
Actor2$ appears to be a single string rather than an array of strings. Why not make it an actual array of strings, then parse each element from back to front for the first colon, and extract the name from that?
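The idea above could be sketched roughly as follows. This is only an illustration under assumptions: the quotes are taken to be already stripped to keep the demo short, the sample data is cut down to two entries, and the names s$, sep$, item$(), segStart, hit and colonPos are invented for the example.

```basic
'sketch only: quotes assumed already stripped, data shortened to two entries
s$ = "url:/name/nm0915208/,name:Naomi Watts},{url:/name/nm0000705/,name:Robin Wright"
sep$ = "},{"
dim item$(10)                 'room for up to 10 elements in this demo

'step 1: split the single string into array elements at each "},{"
n = 0
segStart = 1
hit = instr(s$, sep$, segStart)
while hit > 0
    n = n + 1
    item$(n) = mid$(s$, segStart, hit - segStart)
    segStart = hit + len(sep$)
    hit = instr(s$, sep$, segStart)
wend
n = n + 1
item$(n) = mid$(s$, segStart) 'last element has no trailing separator

'step 2: scan each element from back to front for the first colon
for i = 1 to n
    colonPos = 0
    for k = len(item$(i)) to 1 step -1
        if mid$(item$(i), k, 1) = ":" then
            colonPos = k
            exit for
        end if
    next k
    print mid$(item$(i), colonPos + 1)   'text after that colon
next i
```

This relies on the name being the last field of each element; if the field order changed, the scan would pick up whatever value comes last.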
|
|
|
Post by Walt Decker on Aug 16, 2021 13:45:37 GMT -5
Using your data for Actor2$:
'
'this makes the array$ - actor2$
DQ$ = CHR$(34)
CM$ = ","
C$ = ":"

Actor3$ = ""
Actor2$ = DQ$ + "@type" + DQ$ + C$ + DQ$ + "Person" + DQ$ + CM$ + DQ$ + "url" + DQ$
Actor2$ = Actor2$ + C$ + DQ$ + "/name/nm0915208/" + DQ$ + CM$ + DQ$ + "name" + DQ$
Actor2$ = Actor2$ + C$ + DQ$ + "Naomi Watts" + DQ$ + "}"
Actor2$ = Actor2$ + "," + "{" + DQ$ + "@type" + DQ$ + ":" + DQ$ + "Person" + DQ$
Actor2$ = Actor2$ + "," + DQ$ + "url" + DQ$ + ":" + DQ$ + "/name/nm0000705/" + DQ$
Actor2$ = Actor2$ + "," + DQ$ + "name" + DQ$ + ":" + DQ$ + "Robin Wright" + DQ$
Actor2$ = Actor2$ + "}" + "," + "{" + DQ$ + "@type" + DQ$ + ":"
Actor2$ = Actor2$ + DQ$ + "Person" + DQ$ + "," + DQ$ + "url" + DQ$
Actor2$ = Actor2$ + ":" + DQ$ + "/name/nm1882152/" + DQ$ + ","
Actor2$ = Actor2$ + DQ$ + "name" + DQ$ + ":" + DQ$ + "Xavier Samuel" + DQ$

'"@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},{"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"
print Actor2$

'remove every double quote so only the delimiters { } , : remain
Actor2$ = REMCHAR$(Actor2$, DQ$)

print Actor2$
'walk backwards looking for the "{" that opens each entry
FOR I = LEN(Actor2$) TO 1 STEP -1
    IF MID$(Actor2$, I, 1) = "{" THEN
        J = INSTR(Actor2$, "}", I)
        IF J = 0 THEN
            'last entry: no closing brace, so take the rest of the string
            Actor3$ = MID$(Actor2$, I)
            PRINT Actor3$

            FOR K = LEN(Actor3$) TO 1 STEP -1
                IF MID$(Actor3$, K, 1) = ":" THEN
                    Actor4$ = Actor4$ + "|" + MID$(Actor3$, K + 1)
                    PRINT Actor4$
                    EXIT FOR
                END IF
            NEXT K
        ELSE
            Actor3$ = MID$(Actor2$, I, J - I)
            FOR K = LEN(Actor3$) TO 1 STEP -1
                IF MID$(Actor3$, K, 1) = ":" THEN
                    Actor4$ = Actor4$ + "|" + MID$(Actor3$, K + 1)
                    PRINT Actor4$
                    EXIT FOR
                END IF
            NEXT K
        END IF
    END IF
NEXT I

'first entry: it has no opening brace, so handle it separately
J = INSTR(Actor2$, "}")
FOR I = J - 1 TO 1 STEP -1
    IF MID$(Actor2$, I, 1) = ":" THEN
        Actor4$ = Actor4$ + "|" + MID$(Actor2$, I + 1, J - I - 1)
        PRINT Actor4$
        EXIT FOR
    END IF
NEXT I
'
It might be a bit easier if your data started with "{" and ended with "}".
|
|
|
Post by Rod on Aug 16, 2021 14:02:54 GMT -5
As Walt says, the data needs cleaning up and needs an ending }, or else more code to catch that. Here is a version with the pesky " removed to make the demo simpler.
a$="@type:Person,url:/name/nm0915208/,name:Naomi Watts},{@type:Person,url:/name/nm0000705/,name:Robin Wright},{@type:Person,url:/name/nm1882152/,name:Xavier Samuel}"
tagstart$="name:"
tagend$="}"
'find first item
pos=instr(a$,tagstart$,1)
while pos
    'now find end of item
    posend=instr(a$,tagend$,pos+1)
    'print the item
    print mid$(a$,pos+5,posend-pos-5)
    'move the pointer on
    pos=posend
    'now find the next item if it exists
    pos=instr(a$,tagstart$,pos)
wend
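If the quotes have to stay in the data, the same pointer-moving loop might be adapted along these lines. This is a hedged sketch rather than a tested drop-in: the sample string is shortened to two entries, and q$ and valstart are names introduced only for this example.

```basic
'sketch: same loop idea, but with the double quotes left in the data
q$ = chr$(34)
a$ = "{" + q$ + "name" + q$ + ":" + q$ + "Naomi Watts" + q$ + "},{"
a$ = a$ + q$ + "name" + q$ + ":" + q$ + "Robin Wright" + q$ + "}"
tagstart$ = q$ + "name" + q$ + ":" + q$   'the key including its quotes
tagend$ = q$                              'the value ends at its closing quote

pos = instr(a$, tagstart$, 1)
while pos > 0
    valstart = pos + len(tagstart$)       'first character of the value
    posend = instr(a$, tagend$, valstart)
    if posend = 0 then exit while         'unterminated value: stop
    print mid$(a$, valstart, posend - valstart)
    'move the pointer on and look for the next key
    pos = instr(a$, tagstart$, posend)
wend
```

Because the key includes its surrounding quotes, the /name/ fragment inside each url value is not mistaken for a match.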
|
|
|
Post by tsh73 on Aug 16, 2021 15:56:13 GMT -5
It really looks like (part of) Python code to me (just learning). See?
a=({"@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},
   {"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},
   {"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"}
)
for i in range(3):
    print (a[i]["name"])
produces
Naomi Watts
Robin Wright
Xavier Samuel
Just looked: this is probably JSON format. Either way, both of these could explain why there are so many strange delimiters.
EDIT: Full JSON is probably hard to do with all that nesting. But a subset, just a set of pairs "name:value" without nesting, is surely doable.
Since it is a standard format (or a subset of one), and essentially just a set of pairs "name=value", perhaps some LB code exists or could be adapted (some object browsers by Carl, etc.) to work with it at a higher level: read/parse and create such strings. Something like this (pseudocode, pretending the quotes are set right and the data is read from a file):
obj$='{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"}'
print obj.objType(obj$)
print obj.value(obj$, "name")
print obj.hasKey(obj$, "lastName")
print obj.keys(obj$)
|
|
|
Post by tsh73 on Aug 17, 2021 6:44:45 GMT -5
Based on Walt's data generator.
EDIT: implemented all the functions I mentioned in my previous post.
Results
"@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},{"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"

Number of objects 3

1 "@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"
Naomi Watts
@type >Person<
missingKey ><

2 "@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"
Robin Wright
@type >Person<
missingKey ><

3 "@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"
Xavier Samuel
@type >Person<
missingKey ><

Object 2
"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"

its' type
Person

Does it have key
@type >1<
missingKey >0<
url >1<

its' keys
@type|url|name
'this makes the array$ - actor2$
global DQ$ 'used everywhere
DQ$ = CHR$(34)
CM$ = ","
C$ = ":"

Actor3$ = ""
Actor2$ = DQ$ + "@type" + DQ$ + C$+ DQ$ + "Person" + DQ$ + CM$ + DQ$ + "url" + DQ$
Actor2$ = Actor2$ + C$ + DQ$ + "/name/nm0915208/" + DQ$ + CM$ + DQ$ + "name" + DQ$
Actor2$ = Actor2$ + C$ + DQ$ + "Naomi Watts" + DQ$ + "}"
Actor2$ = Actor2$ + "," + "{" + DQ$ + "@type" + DQ$ + ":" + DQ$ + "Person" + DQ$
Actor2$ = Actor2$ + "," + DQ$ + "url" + DQ$ + ":" + DQ$ + "/name/nm0000705/" + DQ$
Actor2$ = Actor2$ + "," + DQ$ + "name" + DQ$ + ":" + DQ$ + "Robin Wright" + DQ$
Actor2$ = Actor2$ + "}" + "," + "{" + DQ$ + "@type" + DQ$ + ":"
Actor2$ = Actor2$ + DQ$ + "Person" + DQ$ + "," + DQ$ + "url" + DQ$
Actor2$ = Actor2$ + ":" + DQ$ + "/name/nm1882152/" + DQ$ + ","
Actor2$ = Actor2$ + DQ$ + "name" + DQ$ + ":" + DQ$ + "Xavier Samuel" + DQ$

'"@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},{"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"
print Actor2$
print
'Actor2$ = trim$(Actor2$)
'if left$(Actor2$,1)<>"{" then Actor2$="{"+Actor2$
'if right$(Actor2$,1)<>"}" then Actor2$=Actor2$+"}"
'word$ eats delimiter, so first/last {} not needed anyway
'getting separate "objects" delimited with "},{"
N=nWords(Actor2$, "},{")
print "Number of objects ";N
dim obj$(N) 'so I have it later, just to show off names$ function
print

w$=""
i=0
while 1
    i=i+1
    w$ = word$(Actor2$, i, "},{")
    if w$="" then exit while
    obj$(i)=w$
    print i, w$
    'now get value by name
    print objValue$(w$, "name")

    toFind$="@type": print toFind$, ">";objValue$(w$, toFind$);"<"
    toFind$="missingKey": print toFind$, ">";objValue$(w$, toFind$);"<" 'just ""
    print
wend
print "Object ";2
print obj$(2)
obj$ = obj$(2)

print
print "its' type"
print objType$(obj$)

print
print "Does it have key"
toFind$="@type": print toFind$, ">";objHasKey(obj$, toFind$);"<"
toFind$="missingKey": print toFind$, ">";objHasKey(obj$, toFind$);"<" 'just ""
toFind$="url": print toFind$, ">";objHasKey(obj$, toFind$);"<" 'just ""

print
print "its' keys"
print objKeys$(obj$(2))
end
'------------------------------------------------------
function nWords(s$, delim$)
    i=0
    while 1
        i=i+1
        w$ = word$(s$, i, delim$)
        if w$="" then exit while
    wend
    nWords = i-1
end function

function stripDQ$(a$)
    if left$(a$, 1)=DQ$ then a$ = mid$(a$,2)
    if right$(a$, 1)=DQ$ then a$ = left$(a$,len(a$)-1)
    stripDQ$ = a$
end function

function objType$(obj$)
    objType$ = objValue$(obj$, "@type")
end function

function objHasKey(obj$, key$)
    objHasKey = instr("|"+objKeys$(obj$)+"|", "|"+key$+"|")>0
end function
function objKeys$(obj$)
    'get pairs
    'strip DQ
    'get aName
    w$=""
    i=0
    while 1
        i=i+1
        pair$ = word$(obj$, i, DQ$;",";DQ$)
        if pair$="" then exit while
        pair$=stripDQ$(pair$)

        aName$ = word$(pair$, 1, DQ$;":";DQ$) 'here we actually break no-quote, like ("Id":123)
        objKeys$=objKeys$+"|"+aName$
    wend
    objKeys$=mid$(objKeys$, 2)
end function
function objValue$(obj$, name$)
    name$ = stripDQ$(name$)
    'get pairs
    'strip DQ
    'get aName
    'if match, return aVal
    w$=""
    i=0
    while 1
        i=i+1
        pair$ = word$(obj$, i, DQ$;",";DQ$)
        if pair$="" then exit while
        pair$=stripDQ$(pair$)

        aName$ = word$(pair$, 1, DQ$;":";DQ$) 'here we actually break no-quote, like ("Id":123)
        if aName$ = name$ then
            aVal$ = word$(pair$, 2, DQ$;":";DQ$)
            objValue$ = aVal$
            exit function
        end if
    wend

end function
|
|
|
Post by Walt Decker on Aug 17, 2021 12:28:43 GMT -5
Here is a function that will take any delimiter and fill an array with the fields defined by that delimiter. If the case of the delimiter is unimportant, set the last argument to zero, as in the example.
'
'this makes the array$ - actor2$
DQ$ = CHR$(34)
CM$ = ","
C$ = ":"

Ubnd = -1

Actor3$ = ""
Actor2$ = DQ$ + "@type" + DQ$ + C$ + DQ$ + "Person" + DQ$ + CM$ + DQ$ + "url" + DQ$
Actor2$ = Actor2$ + C$ + DQ$ + "/name/nm0915208/" + DQ$ + CM$ + DQ$ + "name" + DQ$
Actor2$ = Actor2$ + C$ + DQ$ + "Naomi Watts" + DQ$ + "}"
Actor2$ = Actor2$ + "," + "{" + DQ$ + "@type" + DQ$ + ":" + DQ$ + "Person" + DQ$
Actor2$ = Actor2$ + "," + DQ$ + "url" + DQ$ + ":" + DQ$ + "/name/nm0000705/" + DQ$
Actor2$ = Actor2$ + "," + DQ$ + "name" + DQ$ + ":" + DQ$ + "Robin Wright" + DQ$
Actor2$ = Actor2$ + "}" + "," + "{" + DQ$ + "@type" + DQ$ + ":"
Actor2$ = Actor2$ + DQ$ + "Person" + DQ$ + "," + DQ$ + "url" + DQ$
Actor2$ = Actor2$ + ":" + DQ$ + "/name/nm1882152/" + DQ$ + ","
Actor2$ = Actor2$ + DQ$ + "name" + DQ$ + ":" + DQ$ + "Xavier Samuel" + DQ$

'"@type":"Person","url":"/name/nm0915208/","name":"Naomi Watts"},{"@type":"Person","url":"/name/nm0000705/","name":"Robin Wright"},{"@type":"Person","url":"/name/nm1882152/","name":"Xavier Samuel"
PRINT Actor2$

'remove every double quote before splitting
Actor2$ = REMCHAR$(Actor2$, DQ$)

PRINT Actor2$

'split on "NAME:" ignoring case (last argument is zero)
Ubnd = FN.GetAllFields(Actor2$, "NAME:", 0)
PRINT Ubnd
FOR I = 0 TO Ubnd
    PRINT I, Ary$(I)
NEXT I
END
'-----------------------------------------------------------------
'-----------------------------------------------------------------

FUNCTION FN.GetAllFields(TxtIn$, Dlm$, MatchCase)

    DlmLen = 0
    NumFlds = 0
    AryCntr = 0
    NumChrs = 0
    EndPosn = 0
    Posn = 0
    I = 0

    TmpIn$ = ""

    DIM FldPos(1000)
    DIM Ary$(-1)

    TmpIn$ = TxtIn$
    IF MatchCase = 0 THEN
        Dlm$ = UPPER$(Dlm$)
        TmpIn$ = UPPER$(TmpIn$)
    END IF

    DlmLen = LEN(Dlm$)
    trace 2
    I = 1
    NumFlds = -1
    [GET.FIELDS]
    I = INSTR(TmpIn$, Dlm$, I)

    IF I = 0 THEN GOTO [END.FIELDS]

    IF I THEN
        NumFlds = NumFlds + 1
        FldPos(NumFlds) = I
        I = I + 1
        GOTO [GET.FIELDS]
    END IF

    [END.FIELDS]
    IF NumFlds < 0 THEN EXIT FUNCTION

    REDIM Ary$(NumFlds + 1)

    Ary$(0) = LEFT$(TxtIn$, FldPos(0) - 1)
    PRINT "ary0 = ";Ary$(0), FldPos(0)
    FOR I = 0 TO NumFlds - 1
        Posn = FldPos(I) + DlmLen
        EndPosn = FldPos(I + 1)
        NumChrs = EndPosn - Posn
        AryCntr = AryCntr + 1
        Ary$(AryCntr) = MID$(TxtIn$, Posn, NumChrs)
    NEXT I
    AryCntr = AryCntr + 1
    Posn = FldPos(I) + DlmLen
    Ary$(AryCntr) = MID$(TxtIn$, Posn)

    FN.GetAllFields = NumFlds + 1
END FUNCTION
'
|
|
|
Post by Walt Decker on Aug 17, 2021 15:28:46 GMT -5
Here are a couple of string functions that may be of use to someone:
'
FUNCTION FN.GetField$(Source$, Dlm$, FieldNum, MatchCase)

    NumFlds = 0
    FldStrt = 0
    NumChrs = 0
    FldEnd = 0
    DlmLen = 0
    I = 0

    DIM FldPos(1000)

    TmpSrc$ = ""

    IF Source$ = "" THEN
        EXIT FUNCTION
    END IF

    TmpSrc$ = Source$

    IF MatchCase = 0 THEN
        Dlm$ = UPPER$(Dlm$)
        TmpSrc$ = UPPER$(TmpSrc$)
    END IF

    DlmLen = LEN(Dlm$)

    IF FieldNum > 0 THEN FieldNum = FieldNum - 1

    NumFlds = -1
    I = 1
    [FIND.FIELDS]
    I = INSTR(TmpSrc$, Dlm$, I)

    IF I = 0 THEN GOTO [END.FIELDS]

    NumFlds = NumFlds + 1
    FldPos(NumFlds) = I
    I = I + 1
    GOTO [FIND.FIELDS]

    [END.FIELDS]
    IF NumFlds < 0 THEN
        REDIM FldPos(-1)
        FN.GetField$ = Source$
        EXIT FUNCTION
    END IF

    IF FieldNum > NumFlds + 1 THEN
        REDIM FldPos(-1)
        EXIT FUNCTION
    END IF

    IF FieldNum = 0 THEN
        FN.GetField$ = LEFT$(Source$, FldPos(0) - 1)
        REDIM FldPos(-1)
        EXIT FUNCTION
    END IF

    IF FieldNum = NumFlds + 1 THEN
        FldStrt = FldPos(NumFlds) + DlmLen
        FN.GetField$ = MID$(Source$, FldStrt)
        REDIM FldPos(-1)
        EXIT FUNCTION
    END IF

    FldStrt = FldPos(FieldNum - 1) + DlmLen
    FldEnd = FldPos(FieldNum)
    NumChrs = FldEnd - FldStrt

    REDIM FldPos(-1)
    FN.GetField$ = MID$(Source$, FldStrt, NumChrs)

END FUNCTION

'----------------------------------------------------------------
'----------------------------------------------------------------
FUNCTION FN.GetFieldCount(Source$, Dlm$, MatchCase)

    FldCnt = 0
    I = 0

    IF Source$ = "" THEN EXIT FUNCTION

    IF MatchCase = 0 THEN
        Source$ = UPPER$(Source$)
        Dlm$ = UPPER$(Dlm$)
    END IF

    I = 1

    [START.COUNT]
    I = INSTR(Source$, Dlm$, I)

    IF I = 0 THEN GOTO [END.COUNT]

    FldCnt = FldCnt + 1
    I = I + 1    'step past this match, otherwise INSTR finds it again forever
    GOTO [START.COUNT]

    [END.COUNT]

    FN.GetFieldCount = FldCnt + 1
END FUNCTION
'
|
|
|
Post by dkl on Aug 17, 2021 23:54:07 GMT -5
WOW, so much response! Thank you. I'm sorry, I should have mentioned that the data was in JSON. Stripping the data down to what I wanted would be a lot easier if I had been using Python, but I can't actually code in Python (!), although I have seen many examples of web scraping using it. Rod's example was similar to what I was looking for, and I think he may have helped me out before, but I couldn't find the example.
I actually managed to come up with a solution, with code which I added to the code in my first post:
actor$ = actor$ +actor4$
pos = pos + len(match$)'or +1 - both work
actor2$ = mid$(actor2$,pos,len(actor2$)):pos = 1'<- this shortens the string and keeps 'pos' at position 1, where Rod moves 'pos' along/thru the array, which was what I was initially trying to do
actor$ = replstr$(actor$,chr$(34)," ")
print "ACTOR$-> ";actor$
It effectively shortens the array$ (actor2$) after every find of 'name'. It's difficult to emulate here, as the initial array$ (actor2$) string was plucked out of a long piece of JSON code. Trying to make a new array$ from it requires adding all the extra code, as Walt did.
However, many thanks for all your help. Now I have other ways of looking at the solution next time.
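For anyone wanting to try the shorten-the-string approach on its own, it might look something like the sketch below. It is a reconstruction under assumptions, not the exact program from this post: the sample data is a two-entry stand-in, and q$, match$ and cut are names chosen for the illustration.

```basic
'sketch of the "shorten the string after every find" idea
q$ = chr$(34)
match$ = q$ + "name" + q$ + ":" + q$
actor2$ = q$ + "name" + q$ + ":" + q$ + "Naomi Watts" + q$ + "},{"
actor2$ = actor2$ + q$ + "name" + q$ + ":" + q$ + "Robin Wright" + q$
actor$ = ""

pos = instr(actor2$, match$)
while pos > 0
    'drop everything up to and including the key
    actor2$ = mid$(actor2$, pos + len(match$))
    cut = instr(actor2$, q$)              'closing quote of the value
    if cut = 0 then exit while            'no closing quote: stop
    actor$ = actor$ + left$(actor2$, cut - 1) + " "
    actor2$ = mid$(actor2$, cut + 1)      'shorten the string again
    pos = instr(actor2$, match$)          'search always restarts at the front
wend
print "ACTOR$-> "; actor$
```

The trade-off against moving a pointer is that the source string is consumed as it goes, so keep a copy if the original is needed later.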
|
|