Is it safe to assume any errors returned from strconv.Parse* functions must be due to bad input data?

76 views Asked by At

In a recent code review, a reviewer took issue with how I handled the error returned from strconv.ParseUint(). This function is documented to return the converted uint value and an error of the *strconv.NumError concrete type. The docs mention two sentinel errors of that type that could be returned (ErrSyntax and ErrRange), both of which imply bad data was provided to it. According to the function's interface, any other error appears possible as well.

For my use case, I needed to know if the string value I have is worthy of being converted to a uint or not. If ParseUint returns an error, and it's one of the sentinel errors, then I got my answer. But if the returned error is neither of those, then I returned it and stopped execution. My reviewer asserted that I should rather assume that any error returned from ParseUint means I gave it bad data, and there's no need to check against the sentinel errors, that there's no reason to check against the sentinel errors, nor to ever return that error (in my use case). They linked to an example in go's standard libraries where the error from ParseUint is treated as a check for bad input data and never returned, and said there are many such examples.

While I can certainly understand that there must exist an algorithm that given good data and sufficient resources will always be able to compute the desired result, the real world does not always match the theoretical ideal. I cannot find anything in the library's documentation that says it does not, and will never, return an error for any other reason but bad data. That the standard library has one and perhaps many such examples is on the one hand reassuring and on the other hand scary, somewhere between "two wrongs don't make a right" with "they're doing it so it must be safe for us to do it too" in the case.

Is this just a case of the library's docs missing a sentence? Or is it good to return the error when it's neither of the two sentinel errors? How should I reason about this?

2

There are 2 answers

1
E. Moffat On BEST ANSWER

Yes, it is safe to assume errors from strconv.ParseXXX functions are due to bad input data.

From the documentation page you mentioned:

The errors that ParseInt returns have concrete type *NumError and include err.Num = s. If s is empty or contains invalid digits, err.Err = ErrSyntax and the returned value is 0; if the value corresponding to s cannot be represented by a signed integer of the given size, err.Err = ErrRange and the returned value is the maximum magnitude integer of the appropriate bitSize and sign.

The way I read this is "any error from strconv.ParseXXX is a NumError and it will be either due to invalid digits or a bitsize range error". My understanding is also that godocs attempt to be as complete as possible in outlining the range of expectations for function calls.

Therefore, I think it is safe to assume that these are the only errors that you might see come back from strconv.ParseXXX functions. If something else came back, I would consider it a documentation bug.

To answer your final question: The pattern you've observed in standard library calls to this function is the correct one. Return the error as a whole and let the caller decide what to do with it. The sentinel errors are to help you know what went wrong and represent the full scope of possible errors from these functions.

0
Burak Serdar On

I don't think there is a single correct answer.

For most cases, if ParseUint returns an error, you assume it is a bad input and return that error. You don't even check if it is one of the two concrete error types. There really is no other error ParseUint can logically return. For most self-contained parse/decode/unmarshal functions, all errors are different variations of input errors.

There are cases where you might want to check the exact error type. For example, if ParseUint returns ErrRange you might want to try parsing it as big.Int. Then checking the error type makes sense.

Go's interface implementation might confuse some, especially those who are used to error/exception handling in different languages. Go functions return the interface error type to denote an error, not a concrete error. Returning a concrete type for an error from a function causes nothing but trouble. So even if a function can return a single concrete type of error, it is declared to return error.