git sparse-checkout

Alex_Rodenberg · October 14, 2013, 11:36am

Hi

Just wondering if anyone else is doing git sparse checkouts in Ansible playbooks?

Maybe they have a better idea than me…

The way to do it is this:

mkdir && cd
git init
git remote add -f

Enable sparse-checkout in the repo:
git config core.sparsecheckout true

Configure sparse-checkout by listing your desired sub-trees in .git/info/sparse-checkout:

echo some/dir/ >> .git/info/sparse-checkout
echo another/sub/tree >> .git/info/sparse-checkout

Checkout from the remote:

git pull

If repo was already checked out before sparse checkout was done, do:

git read-tree -mu HEAD
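
For anyone who wants to try the sequence above without touching a real server, here is a minimal sketch that runs it against a throwaway local repository. The temp directories, the `unwanted` folder, the `production` branch name, and the demo commit identity are all assumptions made up for the demo; only the git commands themselves come from the steps above.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Build a small source repository that stands in for the real remote (hypothetical layout).
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/production   # name the first branch "production"
mkdir -p some/dir another/sub/tree unwanted
echo a > some/dir/a.txt
echo b > another/sub/tree/b.txt
echo c > unwanted/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# The sparse checkout itself: init, add the remote, enable the option,
# list the wanted sub-trees, then pull.
mkdir "$work/sparse"
cd "$work/sparse"
git init -q
git remote add -f origin "$work/src"
git config core.sparsecheckout true
mkdir -p .git/info                            # normally already created by git init
echo "some/dir/" >> .git/info/sparse-checkout
echo "another/sub/tree" >> .git/info/sparse-checkout
git pull -q origin production

# Only the listed sub-trees should be present in the working tree.
ls "$work/sparse"
```

Note that `git remote add -f` still fetches the full history; the sparse settings only limit which paths end up in the working tree.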

The way I’m writing my playbook is this:

tasks:
  - name: getting my git repo
    git: repo=gitolite@my.git dest=/bla remote=origin version=production

  - name: enable sparse-checkout
    shell: git config core.sparsecheckout true chdir=/bla creates=/bla/.git/info/sparse-checkout
    notify: sparse-checkout

  - name: reset folder to only contain sparse structure if it was previously created
    shell: git read-tree -mu HEAD chdir=/bla

handlers:
  - name: sparse-checkout
    copy: src=files/sparse-checkout dest=/bla/.git/info/sparse-checkout

The thing I don’t like about this is that I have to check out my git repo beforehand if I want to use the Ansible module; otherwise I need custom commands to init the repo myself. Or am I missing something?

If anyone else is doing this, please show me if it can be done better.

Michael_DeHaan2 · October 14, 2013, 12:35pm

This is not something the module currently has implemented as a feature.

In fact, this is the first I’m aware of git having this feature.

I’m open to pull requests that attempt to add a nice syntax for doing this to the git module, but I’m not sure there would be a nice one, and generally suspect this would be infrequently used.

Peter_Gehres · October 15, 2013, 12:25am

FWIW, I’ve only ever had issues with sparse checkout. For some reason they just stop working sometimes. YMMV, but unless you really need a sparse checkout, I would avoid it.

Alex_Rodenberg · October 18, 2013, 12:57pm

we are moving away from it, but for now it is needed…

Michael_DeHaan2 · October 18, 2013, 1:27pm

I’m interested in learning more about why someone would need it to decide whether to entertain the pull request or suggest other workarounds.

Let me know if you can!

Peter_Gehres · October 21, 2013, 10:22pm

I could possibly see it if you have a massive repo and only need one or two files and are short on space. Generally a shallow checkout is enough here.

Sparse checkouts are a somewhat hacked-in feature of git.
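
To illustrate the shallow checkout mentioned above, here is a small sketch using a throwaway local repository. The paths, file name, and commit identity are made up for the demo. The `file://` URL is deliberate: as far as I know, git ignores `--depth` when cloning from a plain local path.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Source repository with three commits (hypothetical content).
git init -q "$work/src"
cd "$work/src"
for i in 1 2 3; do
    echo "$i" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
done

# Shallow clone: fetch only the most recent commit.
git clone -q --depth 1 "file://$work/src" "$work/shallow"

full_count=$(git -C "$work/src" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "source has $full_count commits, shallow clone has $shallow_count"
```

A shallow clone trims history, not paths, so every file of the latest commit is still downloaded and checked out.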

Alex_Rodenberg · October 29, 2013, 3:04pm

Purely due to low-bandwidth machines, where we are trying not to download too much extra bloat.

Although I do download it completely the first time, after sparse checkout is on, it only downloads the content that is specified.

We have a huge “local” www templating system, with different skins/themes, that are not needed on all sites.

We are luckily moving away from storing it in git soon, as it is not really ideal for this.

Jamee_Mikell · October 31, 2013, 7:30pm

Another use case for sparse checkout is for website deployment. The development repository has the site code plus Vagrant file and development configuration files, deployment scripts, etc. so the environment can be shared or recreated easily. For deployment, you only want to pull the site code (a directory) from the project repo to the server.

Submodules don’t solve this well because the site code is changing during development requiring double commits (in the site-only repo and in the whole-project repo).

Subtrees require a separate repo for the site code and duplicate copies of code (in the site-only repo and the whole-project repo). Also, subtrees don’t move with the repo, so you have to reestablish them when you pull the project repo–and know to do it. And that introduces an opportunity for code to get out of sync between the two repos.

Sparse checkout keeps all the project resources in one repo (whole-project) and only pulls the site code to the staging and production servers for deploy.

(If anyone has better solutions for this scenario, I’m all ears.)

Michael_DeHaan2 · November 1, 2013, 12:17am

I believe I heard the feedback recently (I didn’t dig in) that sparse checkouts were a bit of a hack, but someone was quite interested (and we have a pull request) in bare repos, which are checkouts without the .git directory.

That being said, I’m probably open to both, but if there are any caveats documented on the git documentation about when they might explode, we should link to them.

Jamee_Mikell · November 1, 2013, 1:02pm

FWIW, sparse checkout is documented at the bottom of http://git-scm.com/docs/git-read-tree.

The only issue documented seems to be if you decide you want the whole repo at a later date, you have to force it.
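
As a rough sketch of what "forcing it" looks like, based on my reading of that read-tree page: set the pattern file to `/*`, re-run read-tree, and only then switch the option off. The throwaway repository, the `unwanted` directory, and the commit identity below are assumptions for the demo.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Source repository with one wanted and one unwanted directory (hypothetical layout).
git init -q "$work/src"
cd "$work/src"
mkdir -p some/dir unwanted
echo a > some/dir/a.txt
echo c > unwanted/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# A normal, full clone that is made sparse after the fact.
git clone -q "$work/src" "$work/clone"
cd "$work/clone"
git config core.sparsecheckout true
mkdir -p .git/info
echo "some/dir/" > .git/info/sparse-checkout
git read-tree -mu HEAD

# Record whether the unwanted directory is gone while sparse.
if [ -e unwanted ]; then sparse_state=present; else sparse_state=absent; fi
echo "while sparse, unwanted/ is $sparse_state"

# Going back to the whole tree: match everything, re-read the tree, then disable.
echo "/*" > .git/info/sparse-checkout
git read-tree -mu HEAD
git config core.sparsecheckout false

ls "$work/clone"
```

The first half also covers the case of a repo that was fully checked out before sparse checkout was enabled.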

Sparse checkout was originally developed to provide SVN-like functionality (externals or shallow checkouts depending on who you read) that some people found useful. http://vmiklos.hu/blog/sparse-checkout-example-in-git-1-7

Some people may consider it a hack because it requires a repo before you can set it up, but if you clone the repo, your checkout isn’t sparse. So instead of simply checking out the remote repo with a parameter that lists the parts you want, sparse checkout requires you to git init the repo, set up sparse checkout, repoint the origin, and then pull the code.

Jamee_Mikell · November 1, 2013, 3:40pm

After a bit more research, sparse checkout in one line:

git clone --template=path/to/template-directory --config core.sparsecheckout=true

where path/to/template-directory/info/sparse-checkout becomes .git/info/sparse-checkout.
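
Here is a sketch of that one-liner against a throwaway local repository. The source repo, the `deploy` target, the `unwanted` directory, and the commit identity are assumptions for the demo; the `--template` and `--config` options are the ones from the command above.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Source repository standing in for the project repo (hypothetical layout).
git init -q "$work/src"
cd "$work/src"
mkdir -p some/dir unwanted
echo a > some/dir/a.txt
echo c > unwanted/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Template directory: its contents are copied into the new .git directory,
# so info/sparse-checkout here becomes .git/info/sparse-checkout in the clone.
mkdir -p "$work/template/info"
echo "some/dir/" > "$work/template/info/sparse-checkout"

# Clone with the template and with sparse checkout switched on from the start.
git clone -q --template="$work/template" \
    --config core.sparsecheckout=true \
    "$work/src" "$work/deploy"

ls "$work/deploy"
```

One thing to be aware of: a custom template directory replaces git's default one, so the usual sample hooks and info/exclude file are not created in the clone.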

Michael_DeHaan2 · November 2, 2013, 2:50am

make it so!

Jamee_Mikell · November 2, 2013, 5:02pm

I have working code for this including documentation updates in the header and a playbook I used to test it (rsync template-directory to remote, clone github repo with template and config to do sparse checkout). Reading the “contributing” guide, I’m not sure what “make tests” does besides run 0 tests–probably because I don’t know how to set up tests in nose. I’m not a Python programmer, just know enough to mimic similar code that’s in the git module already and make it work.

Should I submit a pull or is there a guide for setting up the tests?

Thanks.

Michael_DeHaan2 · November 3, 2013, 4:43pm

“I’m not sure what “make tests” does besides run 0 tests–probably because I don’t know how to set up tests in nose.”

It runs quite a bit more than 0.

You have to install nosetests

“yum search nose” # etc

However, the focus of the unit tests is not module coverage – that’s more for an integration testing effort.

Jamee_Mikell · November 19, 2013, 10:51pm

Debian here. But I installed nose and it ran, but reported no tests found.

Pull request below includes link to test playbook.

https://github.com/ansible/ansible/pull/4923
