I have a git repo using annex that contains a symlink to another volume/mount point. I'd like annex to treat the external link like a file, but git annex copy ignores it.
Can git annex track external symbolic links? It looks like git-annex-import has the desired behavior.
The file I want annex to copy is a symbolic link crossing filesystems, so I cannot replace the symbolic link with a hard link.
cd /Volumes/Phillips/mMR_PETDA/scripts/bids/datalad-bids/ds002385
readlink sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.nii.gz
# /Volumes/Hera/preproc/petrest_rac1/brnsuwdktmp_rest/10195_20160317/func.nii.gz
It is tracked by git
git log --oneline -- sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.nii.gz
# 3f987f1 bold: fix session (replace date with timepoint)
# double check
git annex add sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.nii.gz
git annex status
# empty
but has no annex log and will not copy
git annex copy sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.json --to openneuro --verbose
# empty
git annex log -- sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.nii.gz
# empty
I have a much smaller file I'm comfortable copying instead of linking.
git log --oneline -- sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz
# 2f3e00e fix sub-11299_ses-1 t1
git annex copy sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz --to openneuro --verbose
# copy sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz ok
git annex log -- sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz|cat
# + Fri, 14 Apr 2023 15:08:33 EDT sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz | 9b0d89bd-119d-4867-9997-98d11bd6842c -- [openneuro]
md5sum sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.nii.gz sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz
# 11de4921c66bb02d15c89da4d4e6b27e sub-10195/ses-1/func/sub-10195_ses-1_task-rest_run-1_bold.nii.gz
# 34238205862fac51273b57fa44beec62 sub-11299/ses-1/anat/sub-11299_ses-1_acq-1ADNIG2_T1w.nii.gz
# func/rest file is not in annex
find .git/ -name 'MD5E-*11de4921c66bb02d15c89da4d4e6b27e*' -print -quit
# no results
# T1w file in annex
find .git/ -name 'MD5E-*34238205862fac51273b57fa44beec62*' -print -quit
# .git/annex/objects/4M/gZ/MD5E-s10792493--34238205862fac51273b57fa44beec62.nii.gz
# only T1w in annex objects
tree .git/annex/objects
.git/annex/objects
└── 4M
    └── gZ
        └── MD5E-s10792493--34238205862fac51273b57fa44beec62.nii.gz
            └── MD5E-s10792493--34238205862fac51273b57fa44beec62.nii.gz
It is not possible to have git-annex act on files that are referenced by mere symlinks. git-annex uses symlinks (in general, but not exclusively) as a technical measure to identify the file content associated with items/files in a Git tree. These links are little more than a record of the git-annex key, the identifier under which git-annex keeps its availability record for that content (its knowledge of where this particular file content is available).
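To see which kind of link a given path is, one can inspect its target. The following is a minimal sketch, assuming the default (locked, symlink-based) annex layout in which annexed files point into .git/annex/objects; classify_link is a hypothetical helper name, not part of git-annex.

```shell
# Hypothetical helper (not a git-annex command): report whether a path is
#   annex    - a symlink whose target points into .git/annex/objects
#   external - any other symlink (e.g. one pointing at another volume)
#   regular  - not a symlink at all
classify_link() {
    path=$1
    if [ ! -L "$path" ]; then
        echo regular
        return
    fi
    # Only the link text is inspected; the target does not need to exist.
    case "$(readlink "$path")" in
        */.git/annex/objects/*|.git/annex/objects/*) echo annex ;;
        *) echo external ;;
    esac
}
```

Run against the func/rest file from the question, whose readlink target is under /Volumes/Hera, this would report external, which matches git-annex having no key or log for it.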
If the intent here is to avoid duplication of storage and slow-down due to avoidable copying of large file content, it may be best to consider git-annex's local caching capabilities described at https://git-annex.branchable.com/tips/local_caching_of_annexed_files