In python-hdfs, is there a way to use a wildcard or regex in the list method?


On Linux, hadoop fs -ls accepts a wildcard (for example /sandbox/*), but the Python hdfs client's list method fails on the same argument and reports an unknown path. Is there a different way to use wildcards in python-hdfs?


There is 1 answer

Ezer K (best answer)

I found an approach that uses os.walk with fnmatch and adapted it to hadoop_client (hc in the code below), whose walk method yields the same (root, dirs, files) tuples.

Here is an example that finds CSV files:

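import fnmatch
import posixpath  # HDFS paths always use '/', whatever the local OS

for root, dirs, files in hc.walk(Path):  # hc: hdfs client, Path: HDFS directory to search
    for filename in fnmatch.filter(files, '*.csv'):
        print(posixpath.join(root, filename))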
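
The same idea can be wrapped in small helpers, one for a single-directory wildcard (closer to hadoop fs -ls /sandbox/*) and one for a recursive regex search. This is only a sketch: list_glob and walk_regex are hypothetical names, not part of the hdfs package, and the code assumes the client exposes list(path), returning plain child names, and walk(path), yielding (root, dirs, files) tuples, as in the example above.

```python
import fnmatch
import posixpath
import re


def list_glob(client, directory, pattern):
    """Return full paths of the direct children of `directory` whose
    names match a shell-style pattern such as '*.csv'.

    Hypothetical helper: assumes client.list(directory) returns child names.
    """
    names = client.list(directory)
    # fnmatchcase keeps matching case-sensitive on every local OS.
    return [
        posixpath.join(directory, name)
        for name in names
        if fnmatch.fnmatchcase(name, pattern)
    ]


def walk_regex(client, root, regex):
    """Yield full paths of files anywhere under `root` whose names
    match `regex` (tested with re.search).

    Hypothetical helper: assumes client.walk(root) yields
    (dirpath, dirnames, filenames) tuples.
    """
    compiled = re.compile(regex)
    for dirpath, _dirs, files in client.walk(root):
        for name in files:
            if compiled.search(name):
                yield posixpath.join(dirpath, name)
```

With a client such as hc from the answer, list_glob(hc, '/sandbox', '*.csv') would filter one directory level, while walk_regex(hc, '/sandbox', r'\.csv$') would search the whole tree. The filtering happens on the client side after the listing is fetched, so very large directories still cost a full listing.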