hadoop - Is there a way to set a TTL for certain directories in HDFS?
I have the following requirement. I am adding date-wise data to a specific directory in HDFS, and I need to keep a backup of the last 3 sets and remove the rest. Is there a way to set a TTL on the directory so that the data perishes automatically after a certain number of days?
If not, is there a way to achieve similar results?
This feature is not yet available in HDFS.
There is a JIRA ticket created to support this feature: https://issues.apache.org/jira/browse/hdfs-6382
However, the fix is not yet available.
You need to handle this using a cron job. You can create a job (this can be a simple shell, Perl or Python script) that periodically deletes data older than a pre-configured period.
This job could:
- run periodically (e.g. once an hour or once a day)
- take a list of folders or files that need to be checked, along with the TTL, as input
- delete any file or folder that is older than the specified TTL.
This can be achieved easily with scripting.
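The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, not a tested production script: the function names (`parse_ls_line`, `expired_paths`, `purge`) are made up for this example, and it assumes the `hdfs` command is on the PATH, that `hdfs dfs -ls` prints its usual eight-column listing (permissions, replication, owner, group, size, date, time, path), and that the listed timestamps are in the same timezone as the clock of the machine running the job.

```python
import subprocess
from datetime import datetime, timedelta


def parse_ls_line(line):
    """Return (path, modified) for one `hdfs dfs -ls` entry, or None otherwise.

    Assumed line format (an assumption of this sketch):
    drwxr-xr-x   - user group   0 2014-05-01 10:00 /data/2014-05-01
    """
    parts = line.strip().split(None, 7)
    # Skip the "Found N items" header, blank lines and anything unexpected.
    if len(parts) < 8 or parts[0][0] not in "-d":
        return None
    modified = datetime.strptime(parts[5] + " " + parts[6], "%Y-%m-%d %H:%M")
    return parts[7], modified


def expired_paths(ls_output, ttl, now):
    """Return the listed paths whose modification time is older than now - ttl."""
    cutoff = now - ttl
    expired = []
    for line in ls_output.splitlines():
        entry = parse_ls_line(line)
        if entry is not None and entry[1] < cutoff:
            expired.append(entry[0])
    return expired


def purge(directory, ttl_days, dry_run=True):
    """List `directory` and delete (or, in a dry run, only report) expired entries."""
    listing = subprocess.run(
        ["hdfs", "dfs", "-ls", directory],
        check=True, capture_output=True, text=True,
    ).stdout
    victims = expired_paths(listing, timedelta(days=ttl_days), datetime.now())
    for path in victims:
        if dry_run:
            print("would delete", path)
        else:
            # -r removes directories recursively; files go to the HDFS trash
            # if trash is enabled on the cluster.
            subprocess.run(["hdfs", "dfs", "-rm", "-r", path], check=True)
    return victims
```

A cron entry could then call something like `purge("/data/daily", ttl_days=3, dry_run=False)` once an hour or once a day, where `/data/daily` is a placeholder path. Running with `dry_run=True` first is a cheap way to check what would be removed. Note that this deletes by age; keeping exactly the last 3 sets regardless of age would need the entries sorted by date instead.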