# Use git-annex to transfer large data files
Large data files in our git repositories are managed by git-annex. The file contents themselves are therefore not committed to the GitLab web server or to any other git repository unless we explicitly transfer them with git-annex commands.

When a project is closed and the files have to be handed over to the project requester, use the git-annex commands described here; the procedure is largely automated.

Most of the time, a project requester prefers to access the files directly i...
## Sample procedure
1. Initialize a git repository on the target file system, such as a USB hard disk.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF$ mkdir annex
hmei@Leon-LUMC:/media/hmei/3E4A-01AF$ cd annex/
...
```
2. Add the project folder as a remote repository and sync with it.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git remote add laptop ~/git/source_test_annex/
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git annex sync laptop
...
```
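3. Because the USB disk is formatted as FAT, which does not support symlinks, git-annex uses direct mode by default. Until the content is fetched, each annexed file is a small regular file whose content is the path the symlink would point to.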
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ cat cool_big.bam
.git/annex/objects/Mm/7V/SHA256E-s414416896--8480c5bf05b9e57663e231f6ce373ff17cf3f771f5ba2b6f63700b7287a6f2b4.bam/SHA256E-s414416896--8480c5bf05b9e57663e231f6ce373ff17cf3f771f5ba2b6f63700b7287a6f2b4.bamhmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$
```
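The path printed by `cat` ends in the annex key of the file. As a rough illustration of how to read such a key, the sketch below splits it into its parts, assuming the usual git-annex key layout `BACKEND-sSIZE--NAME`. The variable names are ours and are not part of git-annex.

```shell
#!/bin/sh
# Sketch: split a git-annex key into backend, size and checksum.
# Assumes the key layout BACKEND-sSIZE--NAME; all variable names are illustrative.
path='.git/annex/objects/Mm/7V/SHA256E-s414416896--8480c5bf05b9e57663e231f6ce373ff17cf3f771f5ba2b6f63700b7287a6f2b4.bam/SHA256E-s414416896--8480c5bf05b9e57663e231f6ce373ff17cf3f771f5ba2b6f63700b7287a6f2b4.bam'

key=${path##*/}            # last path component is the key itself
backend=${key%%-*}         # text before the first "-": the hashing backend
rest=${key#*-}             # drop "BACKEND-"
size_field=${rest%%--*}    # "s" followed by the size in bytes
size=${size_field#s}       # strip the leading "s"
name=${key#*--}            # checksum plus the original file extension
checksum=${name%%.*}       # strip the extension

echo "backend:  $backend"
echo "size:     $size bytes"
echo "checksum: $checksum"
```

The size field gives a quick estimate of how much free space the target disk needs before the content is fetched.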
4. Now fetch the actual file contents.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git annex get .
get bam/small (from laptop...)
...
```
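Steps 1, 2 and 4 can be collected in a small shell function. This is only a sketch under a few assumptions: the function name `transfer_annex` and its two arguments are hypothetical, and the explicit `git annex init` call is our guess at what the initialization step involves, since that part of the transcript is not shown in full here.

```shell
#!/bin/sh
# Sketch: copy annexed content from a project repository to a target directory.
# transfer_annex and its arguments are hypothetical names, not part of git-annex.
# The body runs in a subshell, so the caller's working directory is unchanged.
transfer_annex() (
    set -e                          # stop at the first failing command
    src=$1                          # project repository, e.g. ~/git/source_test_annex/
    dest=$2                         # directory on the target file system

    mkdir -p "$dest"
    cd "$dest"
    git init                        # step 1: create the git repository
    git annex init "transfer copy"  # assumption: enable git-annex in the new repository
    git remote add laptop "$src"    # step 2: register the project folder as a remote
    git annex sync laptop           # step 2: fetch branches and annex metadata
    git annex get .                 # step 4: fetch the actual file contents
)
```

With the paths from the transcript, a call could look like `transfer_annex ~/git/source_test_annex/ /media/hmei/3E4A-01AF/annex`.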
5. Don't forget to check the file integrity of the entire repository.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git annex fsck
fsck bam/small (checksum...)
...