Post by Tasp on May 28, 2020 12:02:13 GMT -5
I came across an issue the other day when my PC decided to BSOD on me while writing to a CSV file, so the last row was never finished. When the file is read back in, the program halts with an "input past end of file" error.
My question is: is there a way of detecting that we're about to read past the EOF before we do so and crash?
I currently prefer CSV files to RAF, as they're considerably easier to handle and set up.
The following code only demonstrates the issue. I'm aware it doesn't produce a text file of the correct length; it is for illustration purposes only.
OPEN "file.txt" FOR APPEND AS #file
print #file, "1,2,3,4"
close #file

OPEN "file.txt" FOR INPUT AS #file
WHILE NOT(EOF(#file))
    INPUTCSV #file, aa$, ab$, ac$, ad$, ae$, af$, ag$, ah$, ai$, aj$, ak$, al$, am$, an$
WEND
CLOSE #file
Post by Rod on May 28, 2020 13:17:15 GMT -5
Yes, check the eof() condition before reading the next data item. The help file example covers it. So you can't input a whole stream of data at once; you must input one item at a time.
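A minimal sketch of the item-at-a-time approach described here, assuming a comma-separated file named file.txt (the file name and the count variable are placeholders, not taken from the thread). Each INPUT reads a single item and EOF() is tested before every read, so a short final row should simply end the loop rather than raise an error:

```basic
' Sketch only: read one item per INPUT, testing for end of file before each read.
open "file.txt" for input as #file
count = 0
while eof(#file) = 0
    input #file, item$
    count = count + 1
    print count; ": "; item$
wend
close #file
print "Items read: "; count
end
```

The trade-off is that the program no longer knows where one row ends and the next begins; it only sees a flat stream of items.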
Post by alincon on May 28, 2020 14:09:18 GMT -5
It appears that having to check for eof before every item negates any advantage of using inputcsv. Am I correct?
r.m.
Post by svajoklis on May 28, 2020 15:04:39 GMT -5
Checking for EOF if inputcsv only reads a single line does make sense and Tasp's example is completely reasonable too - for each inputcsv the while loop checks for EOF. The main problem is that inputcsv doesn't work if the line that is being read ends in an EOF, not a newline. Sounds like a problem, unless the inputcsv specification actually requires the csv file to end with an empty line instead of with just an EOF at the end of the last data line.
Post by Chris Iverson on May 28, 2020 15:51:21 GMT -5
EOF vs. newline at the end of a complete record isn't the problem (and in fact, in the tests I've done, it's worked fine either way).
The problem is the input file not having enough remaining items to satisfy every stated variable in the INPUTCSV command.
If you have an INPUTCSV command that reads 8 items from the file, but the file only has four items left, you get a runtime error.
The problem with THAT is that it's unavoidable. There's no graceful way to check if a complete set of items is available without trying to read them (and potentially failing to do so).
You can read the items in one by one using the INPUT command, but then that just makes the INPUTCSV command useless and unreliable. You can scan the file beforehand for errors, but to do that, you have to read the file in anyway, so you may as well do your file processing there, and again, skip the INPUTCSV command.
The only way to catch this while running is to use an error handler (ON ERROR GOTO) and bomb out of the code (which ruins any existing error handlers you might already have, so you need to set them up again).
Sadly, this isn't really something that can be fixed in LB, because it's not really an LB problem. You can't tell if the data remaining in the file will satisfy your read request without attempting to read it. If you want to avoid crashes, switch to the INPUT command(while being mindful of the changes to the variables that will make, since INPUT and INPUTCSV read files differently), or use an error handler to catch bad situations.
If Carl makes the INPUTCSV not crash and instead just give up reading the file if an unexpected EOF happens, then you have no real way of telling if you got a complete record from the file or not, since you're given no indication either way.
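A rough sketch of the "scan the file beforehand" option, for anyone who wants to keep INPUTCSV for the real read. The function name firstBadRow, the file name, and the expected field count of 4 are all assumptions made for this example. It reads each row whole with LINE INPUT and counts commas, so like any simple comma count it does not cope with commas inside quoted fields:

```basic
' Sketch only: pre-scan a CSV so INPUTCSV is attempted only on a well-formed file.
' The file name and the expected field count are assumptions for this example.
expected = 4
badRow = firstBadRow("file.txt", expected)
if badRow = 0 then
    print "Every row has "; expected; " fields."
else
    print "Row "; badRow; " does not have "; expected; " fields."
end if
end

' Hypothetical helper: returns the number of the first row whose field count
' differs from expected, or 0 when every row matches.
' Commas inside quoted fields are NOT handled.
function firstBadRow(path$, expected)
    firstBadRow = 0
    rowNum = 0
    open path$ for input as #scan
    while (eof(#scan) = 0) and (firstBadRow = 0)
        line input #scan, row$
        rowNum = rowNum + 1
        count = 1
        for i = 1 to len(row$)
            if mid$(row$, i, 1) = "," then count = count + 1
        next i
        if count <> expected then firstBadRow = rowNum
    wend
    close #scan
end function
```

This costs a second pass over the file, which is the objection raised above; whether that matters depends on the file size.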
Post by honkytonk on May 29, 2020 2:45:37 GMT -5
Try:
name "filename.csv" as "filename.txt"
Post by honkytonk on May 29, 2020 2:48:50 GMT -5
Try: name "filename.csv" as "filename.txt"
And rename it back to filename.csv when you close it, ready for reopening (otherwise you get an error because "filename.csv" has disappeared).
Post by Rod on May 29, 2020 3:10:01 GMT -5
CSV files are usually very rigid things. In normal circumstances, if you are not at the end of the file then there is another complete record to retrieve, so in most cases eof() coexists very well with inputcsv. If your file is a mess there is really nothing that Liberty can do to fix it. Garbage in ... If you have to parse a messy CSV file you would need to do it one data item at a time, but you won't know which position in the record each item belongs to, so it is a pretty pointless exercise. The real fix is to get the file right in the first place.
Post by svajoklis on May 29, 2020 9:34:32 GMT -5
Is there any error handling to maybe catch that the last inputcsv was wrong and handle it somehow?
Post by Rod on May 29, 2020 10:46:22 GMT -5
Yes, as Chris posted, create an on error event for that circumstance. But what to do with the error? As I said by far the easiest and most logical fix is to make sure the input file is fit for purpose.
Post by Chris Iverson on May 29, 2020 10:52:24 GMT -5
Pretty much agreed with Rod here.
I do actually recommend using an error handler in a situation like this so that your code can gracefully handle the situation without just crashing in front of a customer/user of your product, but the only thing you can really do with the error handler is close the file and notify the user that the file was incomplete/corrupted.
It's a good thing to be able to keep your program going, but there's not much you'd be able to do with the corrupted file.
Depending on what your program actually does with the data, you may decide to just work with all the complete records you can, or you may decide to abort processing entirely. But that's a decision you have to make for your program.
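As a sketch of the "work with all the complete records you can" option, assuming four fields per record and a file named file.txt (both placeholders, as are the label and variable names): a record is only processed after INPUTCSV has filled every variable, and the handler closes the file and reports how far the read got. The handler assumes the error came from the read itself, so the file is open when it runs.

```basic
' Sketch only: process complete records, stop cleanly at a truncated final one.
complete = 0
on error goto [incomplete]
open "file.txt" for input as #file
while eof(#file) = 0
    inputcsv #file, f1$, f2$, f3$, f4$
    ' Reaching this line means all four items were read.
    complete = complete + 1
    print "Record "; complete; ": "; f1$; " "; f2$; " "; f3$; " "; f4$
wend
close #file
print "Finished: "; complete; " complete records."
end

[incomplete]
close #file
print "Stopped after "; complete; " complete records."
print "Error "; Err; ": "; Err$
end
```

This only notices a record cut short at the very end of the file; it cannot tell whether an earlier row was missing an item.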
Post by metro on May 29, 2020 19:19:20 GMT -5
Interesting that there is no error on the offending line. INPUTCSV just takes the next variable from the next line, and carries on like that until it reaches the end of the file.
OPEN "file.txt" FOR output AS #file
print #file,"1,2,3,4,5"
print #file,"2,7,8,9,10"
print #file,"3,7,8,9,11"
print #file,"4,7,8,9"
print #file,"5,7,8,9,11"
print #file,"6,7,8,9,11"
close #file

on error goto [errorHandler]

OPEN "file.txt" FOR INPUT AS #file
check=1
WHILE NOT(EOF(#file))
    INPUTCSV #file, aa$, ab$, ac$, ad$, ae$
    print aa$, ab$, ac$, ad$, ae$
    check=check+1
WEND
CLOSE #file

[errorHandler]
print "Error string is " + chr$(34) + Err$ + chr$(34)
print "Error number is ";Err; " data corrupt on line number ",check
wait
Post by Brandon Parker on May 29, 2020 20:26:21 GMT -5
metro, that is interesting, but one would not expect a data entry to be missing in the middle of the file. It is more likely that the process creating the file would crash and that would be it; one would end up with every row complete except the last one.
I think InputCSV simply inputs the requested number of items using CSV rules, and that is why it carries on to the next line in the file. At least that is how I read what you show occurring.
Here is how I might do it; no guarantees though ...
You would not necessarily need the OpenFile$() function, but I used it extensively in one of my projects and it keeps everything nice and tidy. You could also make the function return True/False depending on whether it succeeded, or change it to a string function that returns a concatenated string of variables as you see fit.
Global False : False = 0
Global True : True = 1
Global CRLF$ : CRLF$ = chr$(13) + chr$(10)
myFilePath$ = "Your File Path Here"
result = inputCSVFile(myFilePath$)
Print "Completed!!"
End
Function inputCSVFile(filePath$)
    On Error GoTo [Error]
    'Input the data in and trim off the whitespace
    data$ = Trim$(OpenFile$(filePath$, "Input", ""))
    'Write the file back to disk
    result$ = OpenFile$(filePath$, "Output", data$)
    'Get the data after the last CRLF$; assumes CRLF$ separates lines in the CSV
    lastRow$ = AfterLast$(data$, CRLF$)
    'Get the number of rows in the file
    numRows = (CountSubstring(data$, CRLF$, 1, 0) + 1)
    'Get the number of items in the last row; assumes no commas are present within cells
    'You would have to write your own to account for commas within strings in CSV cells, but it would not be too difficult
    lastRowItemCount = (CountSubstring(lastRow$, ",", 1, 0) + 1)
    Open filePath$ For Input As #Test
    While Not(EOF(#Test))
        'You will need to update the InputCSV line to match your expected variables
        InputCSV #Test, var1$, var2$, var3$, var4$, var5$, var6$, var7$, var8$, var9$, var10$, var11$, var12$, var13$
        'Printing out the values here to see what they are
        Print var1$;", ";var2$;", ";var3$;", ";var4$;", ";var5$;", ";var6$;", ";var7$;", ";var8$;", ";var9$;", ";var10$;", ";var11$;", ";var12$;", ";var13$
    Wend
    Close #Test
    Exit Function
[Error]
    'We errored out; you can choose what to do here
    Print "Error# - ";Err;" : Error Description - ";Err$
    Select Case Err
        'Error# - 62 : Error Description - Input past end of file: #fileHandle
        Case 62
            Print "Please handle the last row (#";numRows;"): ";lastRow$
            'Do something with your last row here...fix the file...or whatever
            'You have the numRows variable where the last row that would be messed up is
            'as well as the lastRowItemCount variable which should give you the number of items in the row
            '... following the assumptions made above for them
        Case Else
            Print "You decide what to do!"
    End Select
    'Close the file because we errored out of the While Loop
    Close #Test
End Function
'_________________________________________________________________________________________________________________________________________________________ '_________________________________________________________________________________________________________________________________________________________
Function CountSubstring(ByRef string$, substring$, position, CountSubstring)
    position = Instr(string$, substring$, position)
    If position Then CountSubstring = CountSubstring(string$, substring$, (position + Len(substring$)), (CountSubstring + 1))
End Function
'_________________________________________________________________________________________________________________________________________________________ '_________________________________________________________________________________________________________________________________________________________
Function OpenFile$(filepath$, InOut$, data$)
    On Error GoTo [Error]
    Select Case InOut$
        Case "BinaryInput"
            Open filepath$ For Binary As #OpenFile
            fileOpened = True
            OpenFile$ = Trim$(Input$(#OpenFile, LOF(#OpenFile)))
        Case "Input"
            Open filepath$ For Input As #OpenFile
            fileOpened = True
            OpenFile$ = Trim$(Input$(#OpenFile, LOF(#OpenFile)))
        Case "BinaryOutput"
            Open filepath$ For Binary As #OpenFile
            fileOpened = True
        Case "Output"
            Open filepath$ For Output As #OpenFile
            fileOpened = True
        Case "BinaryAppend"
            Open filepath$ For Binary As #OpenFile
            fileOpened = True
        Case "Append"
            Open filepath$ For Append As #OpenFile
            fileOpened = True
    End Select
    If (Instr(InOut$, "Output") > False) Or (Instr(InOut$, "Append") > False) Then
        #OpenFile data$;
        OpenFile$ = str$(True)
    End If
    Close #OpenFile
    Exit Function
[Error]
    If (fileOpened = True) Then Close #OpenFile
End Function

{:0) Brandon Parker
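The listing above calls AfterLast$(), which is not included in the post. A possible stand-in is sketched below; this is a guess at what the helper does (return the text following the last occurrence of a delimiter), not Brandon's actual function, and it assumes the delimiter is not an empty string:

```basic
' Hypothetical stand-in for the AfterLast$() helper used above.
' Returns the text after the last occurrence of delim$ in text$,
' or the whole of text$ when delim$ does not occur.
Function AfterLast$(text$, delim$)
    last = 0
    position = Instr(text$, delim$)
    While position > 0
        last = position
        nextStart = position + Len(delim$)
        If nextStart > Len(text$) Then
            'The delimiter ends the string, so there is nothing left to search
            position = 0
        Else
            position = Instr(text$, delim$, nextStart)
        End If
    Wend
    If last = 0 Then
        AfterLast$ = text$
    Else
        If (last + Len(delim$)) > Len(text$) Then
            AfterLast$ = ""
        Else
            AfterLast$ = Mid$(text$, last + Len(delim$))
        End If
    End If
End Function
```

With this definition, a file containing only one row (no CRLF at all) is treated as being entirely the last row.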