I've been managing the Docker images stored in our Nexus repository with cleanup policies. These are fine for basic behavior and are applied by tasks that run daily (or hourly, or whatever you want).
The cleanup policy has a regex, to avoid deleting images tagged in a certain way (e.g. build-latest), and a "last downloaded at" criterion (e.g. 5 days).
This helps delete images every X days, but some images need to be kept as long as no others exist, i.e. if the only image that exists is build-99, do not delete it. That is something I couldn't do with policies alone.
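To make the limitation concrete, here is a minimal Groovy sketch that imitates such a policy on plain in-memory data. It is not the Nexus API: the `eligible` closure, the field names `tag` and `lastDownloaded`, and the fixed `now` timestamp are all hypothetical and only illustrate the two criteria described above.

```groovy
import java.time.Duration
import java.time.Instant

// Hypothetical stand-in for a cleanup policy; this is NOT the Nexus API.
def protectedTagRegex = /build-latest/       // assumed pattern for tags to keep
def maxAge = Duration.ofDays(5)              // assumed "last downloaded" threshold
def now = Instant.parse('2024-01-10T00:00:00Z')

// An image is eligible for deletion when its tag is not protected
// and it was last downloaded longer ago than the threshold.
def eligible = { Map image ->
    !(image.tag ==~ protectedTagRegex) &&
        Duration.between(image.lastDownloaded, now) > maxAge
}

def images = [
    [tag: 'build-latest', lastDownloaded: Instant.parse('2024-01-01T00:00:00Z')],
    [tag: 'build-99',     lastDownloaded: Instant.parse('2024-01-02T00:00:00Z')],
]

def toDelete = images.findAll(eligible)*.tag
println toDelete
```

Because the rule only looks at each image in isolation, build-99 is selected for deletion even though it is the only real build left, which is the case the policies could not express.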
In the repository layout I want to achieve, my-repository is just a folder name that by default takes the repository name; it is only there for demonstration.
So how do you manage this?
Note: the information about what was done here can be found in various SO posts and on GitHub.
I was able to do this with a Groovy script that runs automatically every day. The script is set in a task of type Admin - Execute script, which is disabled by default in newer Nexus versions. I solved that by following Scripting Nexus Repository Manager 3 in the FAQ section, as well as How to Determine the Location of the Nexus 3 Data Directory.
The script is based on documentation, issues, and code from different places (e.g. StorageTxImpl.java is where you can find the methods that fetch or delete assets, components, etc.). It was also inspired by Using the Nexus3 API how do I get a list of artifacts in a repository, NEXUS-14837, and Nexus 3 Groovy Script development environment setup.
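For reference, the setting those pages describe is, as far as I know, a single property added to `nexus.properties` in the `etc` folder of the Nexus data directory, followed by a restart. The exact file location depends on your installation, so treat the path as an assumption and check the linked pages for your version.

```properties
# $data-dir/etc/nexus.properties (location depends on the installation)
nexus.scripts.allowCreation=true
```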
The script:
The script must run before the second task (relative to the first task, before or after doesn't matter). The policies were also no longer needed, so they are no longer assigned to the repository.
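How it works, or what it does:
It fetches the components of the repository and groups them by name.
For each group, it orders the components by last_downloaded and selects for deletion only the ones that are not among the most recent (3, for example).
deleteComponent(cp) internally deletes the assets and their blobs.
Note: I saw that scripts can be parameterized, but it was not needed in my case.
Note: this can be updated to loop over all repositories, but I just needed one.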
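The steps above can be sketched on plain in-memory data, without any Nexus classes. This is only an illustration of the selection logic under stated assumptions: the maps stand in for components, the integer `lastDownloaded` stands in for a timestamp, and `keepMostRecent` is a hypothetical name for the script's "keep the N most recent" setting.

```groovy
// Hypothetical in-memory stand-ins for Nexus components; this is NOT the Nexus API.
def ignoreVersions = ['build-latest']
def processIfSizeGt = 3   // only prune groups with more than this many images
def keepMostRecent = 2    // number of most recently downloaded images to keep

def components = [
    [name: 'service-A', version: 'build-latest', lastDownloaded: 50],
    [name: 'service-A', version: 'build-1',      lastDownloaded: 10],
    [name: 'service-A', version: 'build-2',      lastDownloaded: 20],
    [name: 'service-A', version: 'build-3',      lastDownloaded: 30],
    [name: 'service-B', version: 'build-99',     lastDownloaded: 1],
]

// Group by name, skip small groups, drop ignored versions,
// sort newest first and keep everything after the first N for deletion.
def toDelete = components.groupBy { it.name }.collectMany { name, group ->
    if (group.size() <= processIfSizeGt) {
        return []
    }
    group.findAll { !(it.version in ignoreVersions) }
         .sort { a, b -> b.lastDownloaded <=> a.lastDownloaded }
         .drop(keepMostRecent)
}

println toDelete*.version
```

Here service-B is left alone because its group is too small, so a lone build-99 is never deleted, and build-latest is never a deletion candidate.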
import org.sonatype.nexus.repository.storage.Asset
import org.sonatype.nexus.repository.storage.Query
import org.sonatype.nexus.repository.storage.StorageFacet
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import org.sonatype.nexus.repository.Repository

class RepositoryProcessor {
    private final log
    private final repository
    private final String repoName = 'my-repository'
    private final String[] ignoreVersions = ['build-latest']
    private final int processIfSizeGt = 3
    private final int delAllButMostRecentNImages = 2

    RepositoryProcessor(log, repository) {
        this.log = log
        this.repository = repository
    }

    void processRepository() {
        def repo = repository.repositoryManager.get(repoName)
        log.debug("found repository: {}", repo)
        // will use default of sonatype
        // https://github.com/sonatype/nexus-public/blob/master/components/nexus-repository/src/main/java/org/sonatype/nexus/repository/storage/StorageFacetImpl.java
        StorageFacet storageFacet = repo.facet(StorageFacet)
        log.debug("initiated storage facet: {}", storageFacet.toString())
        // tx of type https://github.com/sonatype/nexus-public/blob/master/components/nexus-repository/src/main/java/org/sonatype/nexus/repository/storage/StorageTxImpl.java $$EnhancerByGuice ??
        def transaction = storageFacet.txSupplier().get()
        log.debug("initiated transaction instance: {}", transaction.toString())
        try {
            transaction.begin()
            log.info("asset count {}", transaction.countAssets(Query.builder().build(), [repo]))
            log.info("components count {}", transaction.countComponents(Query.builder().build(), [repo]))
            // queried db is orientdb, syntax is adapted to it
            def components = transaction.findComponents(Query.builder()
                    // .where("NOT (name LIKE '%service-A%')")
                    // .and("NOT (name LIKE '%service-B%')")
                    .build(), [repo])
            // cp and cpt refers to component
            // group by name eg: repository/my-repository/some-project/service-A
            def groupedCps = components.groupBy{ it.name() }.collect()
            // fetch assets for each cp
            // and set them in maps to delete the old ones
            groupedCps.each{ cpEntry ->
                // process only if its greater than the minimum amount of images per service
                // size() is called as a method: property access on a list is spread over its elements
                if (cpEntry.value.size() > processIfSizeGt) {
                    // single component processing (i.e this would be done for each service)
                    def cpMap = [:] // map with key eq id
                    def cpAssetsMap = [:] // map of cp assets where key eq cp id
                    // process service cpts
                    cpEntry.value.each { cp ->
                        // cp id of type https://github.com/sonatype/nexus-public/blob/master/components/nexus-orient/src/main/java/org/sonatype/nexus/orient/entity/AttachedEntityId.java
                        def cpId = cp.entityMetadata.id.identity
                        // asset of type: https://github.com/sonatype/nexus-public/blob/master/components/nexus-repository/src/main/java/org/sonatype/nexus/repository/storage/Asset.java
                        def cpAssets = transaction.browseAssets(cp).collect()
                        // document of type https://github.com/joansmith1/orientdb/blob/master/core/src/main/java/com/orientechnologies/orient/core/record/impl/ODocument.java
                        // _fields of type: https://github.com/joansmith1/orientdb/blob/master/core/src/main/java/com/orientechnologies/orient/core/record/impl/ODocumentEntry.java
                        // any field is of type ODocumentEntry.java
                        // append to map if it does not belong to the ignored versions
                        if (!(cp.entityMetadata.document._fields.version.value in ignoreVersions)) {
                            cpMap.put(cpId, cp)
                            cpAssetsMap.put(cpId, cpAssets)
                        }
                    }
                    // log info about the affected folder/service
                    log.info("cp map size: {}, versions: {}",
                            cpMap.values().size(),
                            cpMap.values().entityMetadata.document._fields.version.value)
                    // order desc by last_downloaded (default is asc)
                    log.debug("cp map assets of size: {}", cpAssetsMap.values().size())
                    def sortedFilteredList = cpAssetsMap.values()
                            .sort { it.entityMetadata.document._fields.last_downloaded?.value[0] } // extract Date element using [0]
                            .reverse(true)
                            .drop(delAllButMostRecentNImages)
                    // list of cp ids from the assets that are going to be deleted
                    def sortedAssetsCps = sortedFilteredList.entityMetadata.document._fields.component?.value?.flatten()
                    log.info("cp map assets size after filtering {}", sortedFilteredList.size())
                    // this will print the cps ids to delete
                    log.debug("elements to delete : sorted assets cps list {}", sortedAssetsCps)
                    // deleting components and their assets
                    cpMap.findAll { it.key in sortedAssetsCps }
                            .each { entry ->
                                log.info("deleting cp version {}", entry.value.entityMetadata.document._fields.version?.value)
                                // this will call delete asset internally, and by default will delete blob
                                transaction.deleteComponent(entry.value)
                            }
                }
            }
            transaction.commit();
        } catch (Exception e) {
            log.warn("transaction failed {}", e.toString())
            transaction.rollback()
        } finally {
            transaction.close();
        }
    }
}

new RepositoryProcessor(log, repository).processRepository()