Make parsing of result [Baby] JSON more resilient to garbage?

Before we consider this, what kind of SSH daemon or setting is this that sends MOTD every single time?

I have some ideas, but it seems so non-standard that it's almost not worth including the code hack. But I could be persuaded...

Hi,

Before we consider this, what kind of SSH daemon or setting is this that sends MOTD every single time?

Dropbear 0.48.1-1 (Debian)
(It seems that 0.45 doesn’t have that issue.)

But AFAIK it can be configured to never send MOTD, but that needs recompilation. :frowning:
And there seems to be a way to suppress MOTD by having ~/.hushlogin which didn’t work for me.
And especially I want MOTD. (I know it’s a corner case.)

Thanks,
Dietmar

So this is an embedded device or is some version of Debian using it by default now? I know some tablets or things have used it.

My suggestion is we modify the parse_json function in utils to discard any lines up until the first line that includes the character '{', '[', ' or '='. I think this would also allow us to remove some other hacks in the code like trying to find "tcgetattr" error message noise, and would probably be good for ansible in general. Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This should throw any any MOTD contents out until we encounter any JSON or key=value "baby JSON" formats (which are in there to make
it easier to code ansible modules in Bash).

If that makes sense, I'd take a pull request to add this.

--Michael

Hi Michael,

So this is an embedded device or is some version of Debian using it by default now? I know some tablets or things have used it.

It’s a server machine with Debian (no embedded device). And to my knowledge it’s not (or was ever) default on Debian. The guy who did the administration some ages ago was a bit of a fan of Dropbear. (Well, IIRC it was more secure(?) than OpenSSH sshd since it’s much smaller.)

My suggestion is we modify the parse_json function in utils to discard any lines up until the first line that includes the character ‘{’, ‘[’, ’ or ‘=’. I think this would also allow us to remove some other hacks in the code like trying to find “tcgetattr” error message noise, and would probably be good for ansible in general. Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This should throw any any MOTD contents out until we encounter any JSON or key=value “baby JSON” formats (which are in there to make
it easier to code ansible modules in Bash).

If that makes sense, I’d take a pull request to add this.

Yes, your suggestion makes sense and that’s surely the better way to do it. (I wasn’t sure if we can just throw away lines not matching the above characters since I didn’t know all internals.)

Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This would be solvable I think if we try to parse the next line if one fails or (which I think makes more sense) parse only the last (non blank) line of the output. Since for my understanding the last (non blank) output should be the (module) result anyways. That would make it more resilient I think.

Thanks,
Dietmar

Hi Michael,

> So this is an embedded device or is some version of Debian using it by default now? I know some tablets or things have used it.
It's a server machine with Debian (no embedded device). And to my knowledge it's not (or was ever) default on Debian. The guy who did the administration some ages ago was a bit of a fan of Dropbear. (Well, IIRC it was more secure(?) than OpenSSH sshd since it's much smaller.)

> My suggestion is we modify the parse_json function in utils to discard any lines up until the first line that includes the character '{', '[', ' or '='. I think this would also allow us to remove some other hacks in the code like trying to find "tcgetattr" error message noise, and would probably be good for ansible in general. Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.
>
> This should throw any any MOTD contents out until we encounter any JSON or key=value "baby JSON" formats (which are in there to make
> it easier to code ansible modules in Bash).
>
> If that makes sense, I'd take a pull request to add this.

Yes, your suggestion makes sense and that's surely the better way to do it. (I wasn't sure if we can just throw away lines not matching the above characters since I didn't know all internals.)

> Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This would be solvable I think if we try to parse the next line if one fails or (which I think makes more sense) parse only the _last_ (non blank) line of the output. Since for my understanding the last (non blank) output should be the (module) result anyways. That would make it more resilient I think.

We can't do this last part, because there is nothing illegal about multi-line JSON. Trying to throw away leading junk is ok though.

Hi again,

I missed one point:

It’s not only parse_json but also _make_tmp_path (runner/__init__py) where this filtering needs to be done. And at this point lines starting with ‘/’ are allowed (or even essential). (But we could here also apply the rule: last non-blank line only).

Thanks,
Dietmar

Hi Michael,

This would be solvable I think if we try to parse the next line if one fails or (which I think makes more sense)
parse only the last (non blank) line of the output. Since for my understanding the last (non blank)
output should be the (module) result anyways. That would make it more resilient I think.

We can’t do this last part, because there is nothing illegal about multi-line JSON. Trying to throw away leading junk is ok though.

Understood, thank you for the clarification.

Let’s see if I can make your wish come true about pull request.

Thanks,
Dietmar

Hi again,

I missed one point:

It's not only parse_json but also _make_tmp_path (runner/__init__py) where this filtering needs to be done. And at this point lines starting with '/' are allowed (or even essential). (But we could here also apply the rule: last non-blank line only).

This would work there, yes.

The remote md5 function will also need modifications. I would recommend adding a "last_non_blank_line" function to utils.py to do this.

Hi again,

for anyone who needs this: It can be found at https://github.com/SleeplessAnnoyedNerd/ansible.
So far it seems to work pretty well.

Not sure if I like the current implementation… (well, it’s not that bad, but maybe it can be improved, time will show).
Pull request comes as soon as I’m confident about the implementation.

Thanks,
Dietmar