[swift-evolution] typed throws

Matthew Johnson matthew at anandabits.com
Fri Aug 18 20:11:01 CDT 2017



Sent from my iPad

> On Aug 18, 2017, at 6:56 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
> 
> Joe Groff wrote:
> 
> An alternative approach that embraces the open nature of errors could be to represent domains as independent protocols, and extend the error types that are relevant to that domain to conform to the protocol. That way, you don't obscure the structure of the underlying error value with wrappers. If you expect to exhaustively handle all errors in a domain, well, you're almost certainly going to need to have a fallback case in your wrapper type for miscellaneous errors, but you could represent that instead without wrapping via a catch-all, and as?-casting to your domain protocol with a ??-default for errors that don't conform to the protocol. For example, instead of attempting something like this:
> 
> enum DatabaseError: Error {
>   case queryError(QueryError)
>   case ioError(IOError)
>   case other(Error)
> 
>   var errorKind: String {
>     switch self {
>       case .queryError(let q): return "query error: \(q.query)"
>       case .ioError(let i): return "io error: \(i.filename)"
>       case .other(let e): return "\(e)"
>     }
>   }
> }
> 
> func queryDatabase(_ query: String) throws /*DatabaseError*/ -> Table
> 
> do {
>   try queryDatabase("delete * from users")
> } catch let d as DatabaseError {
>   os_log(d.errorKind)
> } catch {
>   fatalError("unexpected non-database error")
> }
> 
> You could do this:
> 
> protocol DatabaseError {
>   var errorKind: String { get }
> }
> 
> extension QueryError: DatabaseError {
>   var errorKind: String { return "query error: \(query)" }
> }
> extension IOError: DatabaseError {
>   var errorKind: String { return "io error: \(filename)" }
> }
> 
> extension Error {
>   var databaseErrorKind: String {
>     return (self as? DatabaseError)?.errorKind ?? "unexpected non-database error"
>   }
> }
> 
> func queryDatabase(_ query: String) throws -> Table
> 
> do {
>   try queryDatabase("delete * from users")
> } catch {
>   os_log(error.databaseErrorKind)
> }

This approach isn't sufficient for several reasons.  Notably, it requires the underlying errors to already have a distinct type for every category we wish to place them in.  If all network errors have the same type and I want to categorize them based on network availability, authentication, dropped connection, etc., I am not able to do that.
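
To make the limitation concrete, here is a minimal sketch.  Every name in it (`NetworkError`, `Code`, `AuthenticationFailure`) is hypothetical and invented for illustration.  Because a protocol conformance belongs to the type rather than to individual values, an "offline" value ends up in the authentication category as well:

```swift
// All names here are hypothetical, chosen only for illustration.
struct NetworkError: Error {
    enum Code { case offline, unauthorized, connectionDropped }
    let code: Code
}

protocol AuthenticationFailure {}

// Conformance applies to the whole type, so every NetworkError value
// becomes an AuthenticationFailure, including .offline.
extension NetworkError: AuthenticationFailure {}

let offlineError: Error = NetworkError(code: .offline)
let misclassified = offlineError is AuthenticationFailure
```

The cast succeeds for every value of the type, so the protocol alone cannot tell the three situations apart.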

The kind of categorization I want to be able to do requires a custom algorithm.  The specific algorithm used to categorize errors depends on the dynamic context (i.e. the function that is propagating it).  The way I usually think about this categorization is as a conversion initializer as I showed in the example, but it certainly wouldn't need to be accomplished that way.  The most important thing IMO is the ability to categorize during error propagation and make information about that categorization easy for callers to discover.
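
A rough sketch of categorization via a conversion initializer, written with today's untyped `throws`, might look like the following.  The types `TransportError` and `RequestFailure`, the functions `send` and `loadProfile`, and the status-code rules are all assumptions made up for this sketch, not an existing API:

```swift
// Hypothetical low-level error; statusCode is nil when no response arrived.
struct TransportError: Error {
    let statusCode: Int?
}

// Hypothetical categories, organized by likely recovery strategy.
enum RequestFailure: Error {
    case noNetwork
    case retryable(underlying: Error)
    case permanent(underlying: Error)

    // The conversion initializer holds the categorization algorithm.
    init(_ error: Error) {
        guard let transport = error as? TransportError else {
            self = .permanent(underlying: error)
            return
        }
        guard let status = transport.statusCode else {
            self = .noNetwork
            return
        }
        switch status {
        case 500...599: self = .retryable(underlying: error)
        default: self = .permanent(underlying: error)
        }
    }
}

// Stand-in for a lower layer that always fails, to keep the sketch short.
func send(status: Int?) throws {
    throw TransportError(statusCode: status)
}

// The propagating function decides how errors are categorized.
func loadProfile(status: Int?) throws {
    do {
        try send(status: status)
    } catch {
        throw RequestFailure(error)
    }
}
```

Note the explicit catch-and-rethrow in `loadProfile`: without language support, this is the boilerplate each propagating function has to write in order to categorize.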

The output of the algorithm could use various mechanisms for categorization - an enum is one mechanism, distinct types conforming to appropriate categorization protocols is another.  Attaching some kind of category value to the original error or propagating the category along with it might also work (although might be rather clunky).

It is trivial to make the original error immediately available via an `underlyingError` property so I really don't understand the resistance to wrapping errors.  The categorization can easily be ignored at the catch site if desired.  That said, if we figure out some other mechanism for categorizing errors, including placing different error values of the same type into different categories, and matching them based on this categorization I think I would be ok with that.  Using wrapper types is not essential to solving the problem.
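
As a small sketch of that point (all names are hypothetical: `FileMissing`, `CategorizedError`, `load`, `originalError(from:)`), a wrapper can carry the original error along, and a catch site that does not care about the category can simply unwrap it:

```swift
// Hypothetical original error.
struct FileMissing: Error {
    let path: String
}

// Hypothetical wrapper that adds a category and keeps the original error.
struct CategorizedError: Error {
    enum Category { case recoverable, fatal }
    let category: Category
    let underlyingError: Error
}

func load() throws {
    throw CategorizedError(category: .recoverable,
                           underlyingError: FileMissing(path: "/tmp/example"))
}

// A catch site that ignores the categorization unwraps and moves on.
func originalError(from error: Error) -> Error {
    return (error as? CategorizedError)?.underlyingError ?? error
}

var recoveredPath: String?
do {
    try load()
} catch {
    if let missing = originalError(from: error) as? FileMissing {
        recoveredPath = missing.path
    }
}
```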

Setting all of this aside, surely you had your own reasons for supporting typed errors in the past.  What were those and why do you no longer consider them important?

> 
> 
>> On Fri, Aug 18, 2017 at 6:46 PM, Matthew Johnson <matthew at anandabits.com> wrote:
>> 
>>> On Aug 18, 2017, at 6:29 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>> 
>>>> On Fri, Aug 18, 2017 at 6:19 PM, Matthew Johnson <matthew at anandabits.com> wrote:
>>>> 
>>>>> On Aug 18, 2017, at 6:15 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>> 
>>>>>> On Fri, Aug 18, 2017 at 09:20 Matthew Johnson via swift-evolution <swift-evolution at swift.org> wrote:
>>>>>> 
>>>>>> 
>>>>>> Sent from my iPad
>>>>>> 
>>>>>> On Aug 18, 2017, at 1:27 AM, John McCall <rjmccall at apple.com> wrote:
>>>>>> 
>>>>>> >> On Aug 18, 2017, at 12:58 AM, Chris Lattner via swift-evolution <swift-evolution at swift.org> wrote:
>>>>>> >> Splitting this off into its own thread:
>>>>>> >>
>>>>>> >>> On Aug 17, 2017, at 7:39 PM, Matthew Johnson <matthew at anandabits.com> wrote:
>>>>>> >>> One related topic that isn’t discussed is typed errors.  Many third party libraries use a Result type with typed errors.  Moving to an async / await model without also introducing typed errors into Swift would require giving up something that is highly valued by many Swift developers.  Maybe Swift 5 is the right time to tackle typed errors as well.  I would be happy to help with design and drafting a proposal but would need collaborators on the implementation side.
>>>>>> >>
>>>>>> >> Typed throws is something we need to settle one way or the other, and I agree it would be nice to do that in the Swift 5 cycle.
>>>>>> >>
>>>>>> >> For the purposes of this sub-discussion, I think there are three kinds of code to think about:
>>>>>> >> 1) large scale API like Cocoa which evolve (adding significant functionality) over the course of many years and can’t break clients.
>>>>>> >> 2) the public API of shared swiftpm packages, whose lifecycle may rise and fall - being obsoleted and replaced by better packages if they encounter a design problem.
>>>>>> >> 3) internal APIs and applications, which are easy to change because the implementations and clients of the APIs are owned by the same people.
>>>>>> >>
>>>>>> >> These each have different sorts of concerns, and we hope that something can start out as #3 but work its way up the stack gracefully.
>>>>>> >>
>>>>>> >> Here is where I think things stand on it:
>>>>>> >> - There is consensus that untyped throws is the right thing for a large scale API like Cocoa.  NSError is effectively proven here.  Even if typed throws is introduced, Apple is unlikely to adopt it in their APIs for this reason.
>>>>>> >> - There is consensus that untyped throws is the right default for people to reach for in public packages (#2).
>>>>>> >> - There is consensus that Java and other systems that encourage lists of throws error types lead to problematic APIs for a variety of reasons.
>>>>>> >> - There is disagreement about whether internal APIs (#3) should use it.  It seems perfect to be able to write exhaustive catches in this situation, since everything is knowable. OTOH, this could encourage abuse of error handling in cases where you really should return an enum instead of using throws.
>>>>>> >> - Some people are concerned that introducing typed throws would cause people to reach for it instead of using untyped throws for public package APIs.
>>>>>> >
>>>>>> > Even for non-public code.  The only practical merit of typed throws I have ever seen someone demonstrate is that it would let them use contextual lookup in a throw or catch.  People always say "I'll be able to exhaustively switch over my errors", and then I ask them to show me where they want to do that, and they show me something that just logs the error, which of course does not require typed throws.  Every.  Single.  Time.
>>>>>> 
>>>>>> I agree that exhaustive switching over errors is something that people are extremely unlikely to actually want to do.  I also think it's a bit of a red herring.  The value of typed errors is *not* in exhaustive switching.  It is in categorization and verified documentation.
>>>>>> 
>>>>>> Here is a concrete example that applies to almost every app.  When you make a network request there are many things that could go wrong to which you may want to respond differently:
>>>>>> * There might be no network available.  You might recover by updating the UI to indicate that and start monitoring for a reachability change.
>>>>>> * There might have been a server error that should eventually be resolved (500).  You might update the UI and provide the user the ability to retry.
>>>>>> * There might have been an unrecoverable server error (404).  You will update the UI.
>>>>>> * There might have been a low level parsing error (bad JSON, etc).  Recovery is perhaps similar in nature to #2, but the problem is less likely to be resolved quickly so you may not provide a retry option.  You might also want to do something to notify your dev team that the server is returning JSON that can't be parsed.
>>>>>> * There might have been a higher-level parsing error (converting JSON to model types).  This might be treated the same as bad JSON.  On the other hand, depending on the specifics of the app, you might take an alternate path that only parses the most essential model data in hopes that the problem was somewhere else and this parse will succeed.
>>>>>> 
>>>>>> All of this can obviously be accomplished with untyped errors.  That said, using types to categorize errors would significantly improve the clarity of such code.  More importantly, I believe that by categorizing errors in ways that are most relevant to a specific domain a library (perhaps internal to an app) can encourage developers to think carefully about how to respond.
>>>>> 
>>>>> I used to be rather in favor of adding typed errors, thinking that it can only benefit and seemed reasonable. However, given the very interesting discussion here, I'm inclined to think that what you articulate above is actually a very good argument _against_ adding typed errors.
>>>>> 
>>>>> If I may simplify, the gist of the argument advanced by Tino, Charlie, and you is that the primary goal is documentation, and that documentation in the form of prose is insufficient because it can be unreliable. Therefore, you want a way for the compiler to enforce said documentation. (The categorization use case, I think, is well addressed by the protocol-based design discussed already in this thread.)
>>>> 
>>>> Actually documentation is only one of the goals I have and it is the least important.  Please see my subsequent reply to John where I articulate the four primary goals I have for improved error handling, whether it be typed errors or some other mechanism.  I am curious to see what you think of the goals, as well as what mechanism might best address those goals.
>>> 
>>> Your other three goals have to do with what you term categorization, unless I misunderstand. Are those not adequately addressed by Joe Groff's protocol-based design?
>> 
>> Can you elaborate on what you mean by Joe Groff’s protocol-based design?  I certainly haven’t seen anything that I believe addresses those goals well.
>> 
>>>  
>>>>> 
>>>>> However, the compiler itself cannot reward, only punish in the form of errors or warnings; if exhaustive switching is a red herring and the payoff for typed errors is correct documentation, the effectiveness of this kind of compiler enforcement must be directly proportional to the degree of extrinsic punishment inflicted by the compiler (since the intrinsic reward of correct documentation is the same whether it's spelled using doc comments or the type system). This seems like a heavy-handed way to enforce documentation of only one specific aspect of a throwing function; moreover, if this use case were to be sufficiently compelling, then it's certainly a better argument for SourceKit (or some other builtin tool) to automatically generate information on all errors thrown than for the compiler to require that users declare it themselves--even if opt-in.
>>>>> 
>>>>> 
>>>>>> Bad error handling is pervasive.  The fact that everyone shows you code that just logs the error is a prime example of this.  It should be considered a symptom of a problem, not an acceptable status quo to be maintained.  We need all the tools at our disposal to encourage better thinking about and handling of errors.  Most importantly, I think we need a middle ground between completely untyped errors and an exhaustive list of every possible error that might happen.  I believe a well designed mechanism for categorizing errors in a compiler-verified way can do exactly this.
>>>>>> 
>>>>>> In many respects, there are similarities to this in the design of `NSError` which provides categorization via the error domain.  This categorization is a bit more broad than I think is useful in many cases, but it is the best example I'm aware of.
>>>>>> 
>>>>>> The primary difference between error domains and the kind of categorization I am proposing is that error domains categorize based on the source of an error whereas I am proposing categorization driven by likely recovery strategies.  Recovery is obviously application dependent, but I think the example above demonstrates that there are some useful generalizations that can be made (especially in an app-specific library), even if they don't apply everywhere.
>>>>>> 
>>>>>> > Sometimes we then go on to have a conversation about wrapping errors in other error types, and that can be interesting, but now we're talking about adding a big, messy feature just to get "safety" guarantees for a fairly minor need.
>>>>>> 
>>>>>> I think you're right that wrapping errors is tightly related to an effective use of typed errors.  You can do a reasonable job without language support (as has been discussed on the list in the past).  On the other hand, if we're going to introduce typed errors we should do it in a way that *encourages* effective use of them.  My opinion is that encouraging effective use means categorizing (wrapping) errors without requiring any additional syntax beyond the simple `try` used by untyped errors.  In practice, this means we should not need to catch and rethrow an error if all we want to do is categorize it.  Rust provides good prior art in this area.
>>>>>> 
>>>>>> >
>>>>>> > Programmers often have an instinct to obsess over error taxonomies that is very rarely directed at solving any real problem; it is just self-imposed busy-work.
>>>>>> 
>>>>>> I agree that obsessing over intricate taxonomies is counter-productive and should be discouraged.  On the other hand, I hope the example I provided above can help to focus the discussion on a practical use of types to categorize errors in a way that helps guide *thinking* and therefore improves error handling in practice.
>>>>>> 
>>>>>> >
>>>>>> >> - Some people think that while it might be useful in some narrow cases, the utility isn’t high enough to justify making the language more complex (complexity that would intrude on the APIs of result types, futures, etc)
>>>>>> >>
>>>>>> >> I’m sure there are other points in the discussion that I’m forgetting.
>>>>>> >>
>>>>>> >> One thing that I’m personally very concerned about is in the systems programming domain.  Systems code is sort of the classic example of code that is low-level enough and finely specified enough that there are lots of knowable things, including the failure modes.
>>>>>> >
>>>>>> > Here we are using "systems" to mean "embedded systems and kernels".  And frankly even a kernel is a large enough system that they don't want to exhaustively switch over failures; they just want the static guarantees that go along with a constrained error type.
>>>>>> >
>>>>>> >> Beyond expressivity though, our current model involves boxing thrown values into an Error existential, something that forces an implicit memory allocation when the value is large.  Unless this is fixed, I’m very concerned that we’ll end up with a situation where certain kinds of systems code (i.e., that which cares about real time guarantees) will not be able to use error handling at all.
>>>>>> >>
>>>>>> >> JohnMC has some ideas on how to change code generation for ‘throws’ to avoid this problem, but I don’t understand his ideas enough to know if they are practical and likely to happen or not.
>>>>>> >
>>>>>> > Essentially, you give Error a tagged-pointer representation to allow payload-less errors on non-generic error types to be allocated globally, and then you can (1) tell people to not throw errors that require allocation if it's vital to avoid allocation (just like we would tell them today not to construct classes or indirect enum cases) and (2) allow a special global payload-less error to be substituted if error allocation fails.
>>>>>> >
>>>>>> > Of course, we could also say that systems code is required to use a typed-throws feature that we add down the line for their purposes.  Or just tell them to not use payloads.  Or force them to constrain their error types to fit within some given size.  (Note that obsessive error taxonomies tend to end up with a bunch of indirect enum cases anyway, because they get recursive, so the allocation problem is very real whatever we do.)
>>>>>> >
>>>>>> > John.
>>>>>> 
>> 
> 