h.mei created page: home, authored by Mei
# Use git-annex to transfer large data files
Big data files in git are handled by git annex. The files themselves are therefore not committed to the GitLab web server or to other git repositories unless we explicitly use git annex commands to do so.
When a project is closed and we want to transfer its files to the project requester, we should use the git annex commands described here, which make the procedure largely automated.
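
The sample procedure on this page can be condensed into one shell function. This is only a minimal sketch: the function name `transfer_annex`, its two arguments, the remote name `source` and the description string are illustrative and not part of git-annex. It also assumes that step 1 amounts to `git init` followed by `git annex init`, since the transcript for that step is elided on this page.

```shell
# Sketch only: transfer_annex, its arguments and the remote name "source"
# are hypothetical names chosen for this example.
# The body runs in a subshell so the caller's working directory is unchanged.
transfer_annex() (
    src=$1     # path to the project repository that holds the annexed files
    dest=$2    # directory to create on the target file system (e.g. a USB disk)

    mkdir -p "$dest" && cd "$dest" || exit 1

    git init &&                        # step 1: new repository on the target (assumed)
    git annex init "transfer copy" &&  # step 1: enable git-annex in it (assumed)
    git remote add source "$src" &&    # step 2: project folder as remote
    git annex sync source &&           # step 2: fetch branches and annex metadata
    git annex get . &&                 # step 4: copy the actual file contents
    git annex fsck                     # step 5: verify checksums
)
```

It could be called as `transfer_annex ~/git/source_test_annex /media/hmei/3E4A-01AF/annex`. The `&&` chain stops at the first failing step, so a failed sync is never followed by a misleading `fsck`.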
...@@ -7,7 +9,8 @@ Most of time for a project requester, it is perferred to access files directly i
## Sample procedure
1. Initialize a git repository on the target file system, such as a USB hard disk.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF$ mkdir annex
hmei@Leon-LUMC:/media/hmei/3E4A-01AF$ cd annex/
...@@ -27,6 +30,7 @@ ok
```
2. Use the project folder as the remote repository and sync it.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git remote add laptop ~/git/source_test_annex/
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git annex sync laptop
...@@ -69,12 +73,14 @@ drwx------ 2 hmei hmei 4096 mrt 26 21:18 bam
```
3. Since the USB disk uses FAT, git-annex uses direct mode by default. Until the content is fetched, each file holds the symlink target as plain text instead of being a symlink.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ cat cool_big.bam
.git/annex/objects/Mm/7V/SHA256E-s414416896--8480c5bf05b9e57663e231f6ce373ff17cf3f771f5ba2b6f63700b7287a6f2b4.bam/SHA256E-s414416896--8480c5bf05b9e57663e231f6ce373ff17cf3f771f5ba2b6f63700b7287a6f2b4.bamhmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$
```
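
To see which files on the FAT disk are still placeholders, one option is to test whether a file's content starts with a path into `.git/annex/objects/`, as in the `cat` output above. The helper below is a rough heuristic and not a git-annex command. The name `is_annex_placeholder` is hypothetical, and the pattern assumes placeholders look like the one shown here, optionally prefixed with `../` for files in subdirectories.

```shell
# Hypothetical helper: succeeds if the file looks like a git-annex placeholder,
# i.e. its first bytes are a (possibly ../-prefixed) path into .git/annex/objects/.
is_annex_placeholder() {
    # Only the first 512 bytes are inspected, so large data files stay cheap to test.
    head -c 512 "$1" | LC_ALL=C grep -q '^\(\.\./\)*\.git/annex/objects/'
}
```

For example, `is_annex_placeholder cool_big.bam && echo "content not fetched yet"` would report the file shown above before `git annex get` has run.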
4. Now we can get the actual files.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git annex get .
get bam/small (from laptop...)
...@@ -93,6 +99,7 @@ drwx------ 2 hmei hmei 4096 mrt 26 21:18 bam
```
5. Don't forget to check the file integrity of the entire repository.
```
hmei@Leon-LUMC:/media/hmei/3E4A-01AF/annex$ git annex fsck
fsck bam/small (checksum...)
...