hadoop - Is there a way to set a TTL for certain directories in HDFS? -


I have the following requirement. I am adding date-wise data to a specific directory in HDFS, and I need to keep a backup of the last 3 sets and remove the rest. Is there a way to set a TTL on the directory so that the data perishes automatically after a certain number of days?

If not, is there a way to achieve similar results?

This feature is not yet available in HDFS.

There is a JIRA ticket created to support this feature: https://issues.apache.org/jira/browse/hdfs-6382

However, the fix is not yet available.

You need to handle this with a cron job. You can create a job (a simple shell, Perl or Python script) that periodically deletes data older than a pre-configured period.

This job could:

  • run periodically (e.g. once an hour or once a day)
  • take a list of folders or files that need to be checked, along with the TTL, as input
  • delete any file or folder that is older than the specified TTL.

This can be achieved easily with scripting.
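
As a rough illustration of the steps above, here is a minimal Python sketch. It assumes the `hdfs` command-line client is on the PATH, that `hdfs dfs -ls` prints the usual eight-column listing (permissions, replication, owner, group, size, date, time, path), and that the modification time shown in the listing is a good enough proxy for the age of each date-wise set. The function names and the `dry_run` flag are hypothetical, chosen only for this example.

```python
import subprocess
from datetime import datetime, timedelta


def parse_ls_line(line):
    """Return (modified_time, path) for one `hdfs dfs -ls` entry, or None."""
    parts = line.split(None, 7)
    if len(parts) < 8:
        return None  # e.g. the "Found N items" header line
    try:
        modified = datetime.strptime(parts[5] + " " + parts[6], "%Y-%m-%d %H:%M")
    except ValueError:
        return None  # not a listing entry in the assumed format
    return modified, parts[7]


def expired_paths(ls_output, ttl_days, now):
    """Return the paths in the listing whose modification time is older than the TTL."""
    cutoff = now - timedelta(days=ttl_days)
    expired = []
    for line in ls_output.splitlines():
        entry = parse_ls_line(line)
        if entry is not None and entry[0] < cutoff:
            expired.append(entry[1])
    return expired


def purge(directory, ttl_days, dry_run=True):
    """List `directory` in HDFS and delete entries older than `ttl_days` days."""
    listing = subprocess.run(
        ["hdfs", "dfs", "-ls", directory],
        capture_output=True, text=True, check=True,
    ).stdout
    for path in expired_paths(listing, ttl_days, datetime.now()):
        if dry_run:
            # Assumption: a dry run by default is safer while testing the job.
            print("would delete", path)
        else:
            # -skipTrash removes the data immediately instead of moving it to trash.
            subprocess.run(["hdfs", "dfs", "-rm", "-r", "-skipTrash", path], check=True)
```

A cron entry could then call something like `purge("/data/daily", ttl_days=3, dry_run=False)` once a day, where `/data/daily` is a placeholder path. Note that this removes data by age; if exactly the last 3 sets must survive even when no new data arrives, the script would need to sort the entries and keep the newest three instead.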

