Readable error message for all humans

Marc_Trudel · June 21, 2014, 3:59pm

Greetings,

Someone at work brought this one to me, and I thought I would put the question out there and see what others do/think about this.

We have a deployment tool which early on turned into a local development environment management tool as well (it provisions a VM according to the configuration and requirements of a project, which can be modified at any time using a configuration file). It works fantastically well, but unlike system managers, developers don’t want to care about error cases. So for required configuration, wherever a default is not possible we check the data and print out a human-readable error with some details. However, it sometimes happens that the failure is due to a bug in the playbook, or to some manual modifications a user has made on their machine, and so on.

My question would be: is there a proper pattern for printing human-readable errors aimed at a customer rather than at someone who does deployments and operations for a living? I am thinking of pushing the tool itself towards less and less technical people (for all sorts of reasons), so for me it would be nice if we had a way to say, for example, “This error should never happen, contact operations” or “This may be caused by a network connectivity problem. Check your internet connection, and please try again” when you try to download something and it fails. I can imagine that the ability to create generic error messages would also come in handy.

Cheers!

Ernest0x · June 21, 2014, 5:18pm

Hi, there is no single pattern for system failure causes; systems can fail in many ways and for many reasons. However, you can take a statistical approach: analyze the most common errors caused by user configuration or usage, and create a mapping to possible remedies or workarounds. Make sure, though, that you do not put too much trust in your guess at an error's cause, and do not hide any useful details. You may have historical indications that an error was caused by user misconfiguration when it could actually be a bug. So I would suggest always having your tool create a detailed error report for your system engineers, regardless of the error.
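
One way to picture the mapping described above is a small lookup from error patterns to remedies that never discards the raw details. This is only a minimal sketch: the patterns, hint texts, function name and sample error string are all hypothetical and do not come from Ansible or from the tool discussed here.

```python
import re

# Hypothetical pattern -> hint table. In practice it would be built from
# the errors users actually hit most often.
KNOWN_CAUSES = [
    (re.compile(r"connection (refused|timed out)|name or service not known", re.I),
     "This may be caused by a network connectivity problem. "
     "Check your internet connection and try again."),
    (re.compile(r"permission denied", re.I),
     "The tool could not access a file or directory. Check its permissions."),
]

FALLBACK_HINT = "This error should never happen, contact operations."


def explain_failure(raw_error):
    """Return hints for the user plus the untouched details for engineers."""
    hints = [hint for pattern, hint in KNOWN_CAUSES if pattern.search(raw_error)]
    return {
        "hints": hints or [FALLBACK_HINT],
        # The raw text is always kept: a guessed cause may hide a real bug.
        "details": raw_error,
    }


# Made-up error text, used only to exercise the lookup.
report = explain_failure("download failed: Connection refused")
print("\n".join(report["hints"]))
print("Details for operations:", report["details"])
```

Keeping `details` next to the hints means the user sees something readable while the engineers still get the full report.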

Michael_DeHaan1 · June 22, 2014, 4:03pm

I’m not sure how this relates to Ansible specifically.

If you can phrase this in terms of improving Ansible error messages in ways that would make better sense for non-technical users, I’m interested in the discussion.

Marc_Trudel · June 23, 2014, 9:02am

Maybe something like:

- name: "Some task"
  errorMessage: "This task might have failed because of bad network connectivity"
  curl: […]

Or something like that.

I have no idea what format would be nice. But I am thinking that it could be nice to list at least some of the potential causes of the error which are known at the time of writing the role or playbook.

Ernest0x · June 23, 2014, 10:30am

So, if I understand you correctly, you are proposing a way to output supplementary messages as hints about what may have gone wrong when playbook/role tasks fail, and possibly what can be done to overcome the error. I think this could be useful for helping users recover from playbook/role-specific error conditions that the playbook/role writer can guess, but the module writer cannot. Do you have any thoughts on the presentation format? P.S. The camel-cased “errorMessage” is surely not a good name for this. I would prefer something like ‘error_hints’ or ‘failure_hints’ that takes a list of strings.

Michael_DeHaan1 · June 23, 2014, 12:12pm

We will not be doing this, by the way.

Marc_Trudel · June 23, 2014, 1:45pm

Hum, what do you mean? That it is a bad format, a bad idea overall, or that it will need to come from the open-source community?

As for the format, I don’t really care. I can try to think of something better.

Michael_DeHaan1 · June 23, 2014, 2:14pm

I don’t think it’s a very effective idea for Ansible, when there are often thousands of things that could produce a failure. We will share the failure message, but the “why” is something that humans should decipher.

Marc_Trudel · June 24, 2014, 6:28am

We obviously cannot have a meaningful message for all errors, but I think it would be nice to offer something in the ballpark of “you might want to check the following things on your system”. For my use case, I think that would be a start.

Ernest0x · June 24, 2014, 6:42am

BTW, this might not be the most elegant solution for your case, but you could write ‘debug’ tasks with a conditional to run only on failure of preceding tasks to output the meaningful message you want.

Marc_Trudel · June 26, 2014, 1:17am

Seems like this idea won’t fly very far. Too bad.

Let me rephrase the question then: with Ansible in its current state, is there a way to get the error as a data object, from which a parent program would be able to decide how to either handle or present the error?

Michael_DeHaan1 · June 26, 2014, 2:03am

What do you mean by “parent program”?

Ansible already returns JSON data from modules, and callback plugins are available.
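
As a rough illustration of a parent program consuming such data, the sketch below assumes the failed task's result has already been captured as JSON text (for example, written out by a callback plugin, which is not shown here). The `failed` and `msg` keys follow the usual convention for module results. The sample payload, the `classify` rules and the category names are hypothetical.

```python
import json


def classify(result):
    """Decide how a parent program might treat one task result (a dict)."""
    if not result.get("failed"):
        return "ok"
    msg = str(result.get("msg", "")).lower()
    # Hypothetical rule: connectivity problems are worth a retry by the user.
    if "timed out" in msg or "connection" in msg:
        return "retry"
    # Anything unrecognised goes to the people who run the tool.
    return "escalate"


def present(json_text):
    """Turn one JSON-encoded task result into a message for the end user."""
    result = json.loads(json_text)
    action = classify(result)
    if action == "retry":
        return "Check your internet connection, and please try again."
    if action == "escalate":
        return "Unexpected error, contact operations: %s" % result.get("msg", "(no message)")
    return "Done."


# Made-up payload standing in for a captured module result.
sample = '{"failed": true, "msg": "Connection timed out while fetching the archive"}'
print(present(sample))
```

The decision (`classify`) is kept separate from the wording (`present`) so the same result can either be handled automatically or shown to a person.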

Marc_Trudel · June 26, 2014, 8:09am

Ah, did not know, I’ll take a look.
