Re-implement tar support using native tarfile module

Hello ansible-dev team,

Hope you are doing well. Although I do not use Ansible, I started looking into the source code and learnt a lot about how to deal with certain automation tasks. The Ansible team on IRC was also really helpful when I wanted to find something in the source code. At PyCon 2016, one of the core developers (Toshio) gave me a good explanation of how Ansible works internally, along with a starting point. At work, I had to use the tarfile module for a task with Fabric, so I spent time looking into the tarfile module while also looking at how Ansible handles the unarchive action, to think of important use cases. Toshio mentioned at PyCon, and it is also noted in the unarchive.py file (https://github.com/ansible/ansible-modules-core/blob/devel/files/unarchive.py#L93), that doing tar support with the tarfile module is on the to-do list.

I was wondering if I could work on this task? I know it may not be easy at first, especially since it is a feature rather than a bug fix, but I would like to try. I was also hoping that one of the core developers could help me out as a mentor in case I have questions or want to make sure I don't overlook anything. That would hopefully make the task simpler and more fun, because it would allow me to understand Ansible better.

If this is all possible, I have some questions and thoughts about this task:

  1. Which Python versions do we need to take care of? For this module, are we expected to support Python 2.4 and later? And up to which Python 3 version? There are some differences in the tarfile module between Python 2 and Python 3.

  2. xz support: the tarfile module in Python 2 has nothing for xz. For Python 2.7, an lzma module is available only as a separate package, not in the standard library. In Python 3 (3.3 and later), support for xz is included in the tarfile module, which imports the lzma module.

  3. I am not sure how to handle the is_unarchived method with the tarfile module. I did not find anything in the tarfile module that can be used directly to check whether an archive has already been unarchived.

  4. Regarding points 2 and 3, maybe use the tarfile module where possible, and fall back to the binary (bin path) where the tarfile module cannot help. For xz, maybe add separate Python 2.7 and Python 3 checks, and for versions older than 2.6, use the bin path. I know this can get messy, so I am open to ideas.

  5. Maybe use the urlparse module to detect URLs rather than relying on the '://' in src check.

  6. Use the tarfile.is_tarfile method to check whether the given file is a valid tar file. The problem with this is that it will return True for empty files (partially fixed in Python 3, but it may still return True in some cases). However, you already handle the case where an empty src file is provided.
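
To make points 2, 3, 5 and 6 a bit more concrete, here is a rough sketch of the kind of helpers I have in mind. All four function names are made up for illustration and are not taken from unarchive.py. The sketch assumes Python 2.7 or later, treats a source as a URL only when it has both a scheme and a host, and uses a simple size comparison as a stand-in for a real is_unarchived check, so please read it as a starting point for discussion rather than a proposed implementation.

```python
import os
import tarfile

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def looks_like_url(src):
    """Hypothetical helper: True if src has both a scheme and a host.

    Assumption: plain paths (and things like file:///...) count as local.
    """
    parts = urlparse(src)
    return bool(parts.scheme) and bool(parts.netloc)


def is_usable_tarfile(path):
    """Hypothetical helper: reject missing or empty files before asking
    tarfile, since is_tarfile alone may accept an empty file."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    return tarfile.is_tarfile(path)


def xz_supported():
    """Hypothetical helper: True if this interpreter's tarfile can open
    xz archives; otherwise the caller could fall back to the tar binary."""
    try:
        import lzma  # noqa: F401
    except ImportError:
        return False
    return hasattr(tarfile.TarFile, "xzopen")


def members_already_extracted(archive, dest):
    """Hypothetical stand-in for is_unarchived: every member must exist
    under dest, and regular files must have the same size.

    Assumption: a size match is good enough; a real check would probably
    also compare mtime, mode and ownership.
    """
    with tarfile.open(archive) as tf:
        for member in tf.getmembers():
            target = os.path.join(dest, member.name)
            if not os.path.lexists(target):
                return False
            if member.isfile() and os.path.getsize(target) != member.size:
                return False
    return True
```

The idea would be to call looks_like_url(src) first, then is_usable_tarfile on the local copy, and only use the tarfile code path for xz when xz_supported() returns True.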

Let me know what you think. I understand if you think this may be too much for a new contributor, but I would really like to give it a try. It would really help if there were a go-to person for clarifying confusion or discussing alternative approaches.

Usman