Date: 2024-10-23

No additional steps are required to run the tools in an on-premises environment, including standalone/local machines.

The tools CLI depends on the Python implementation of PyArrow, which relies on the following environment variables to bind with HDFS:

HADOOP_HOME: the root of your installed Hadoop distribution. It often contains “lib/native/libhdfs.so”.

JAVA_HOME: the location of your Java SDK installation.

ARROW_LIBHDFS_DIR (optional): explicit location of “libhdfs.so” if it’s installed somewhere other than $HADOOP_HOME/lib/native.
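The variables above can be checked before launching the tools. The following is a minimal sketch, not part of the tools CLI: the function name `check_hdfs_env` is hypothetical, and it assumes a Linux layout where the native library is named libhdfs.so.

```python
import os
from pathlib import Path


def check_hdfs_env(env=None):
    """Return a list of problems found in the HDFS-related variables.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []

    # HADOOP_HOME and JAVA_HOME are the two required variables.
    for name in ("HADOOP_HOME", "JAVA_HOME"):
        if not env.get(name):
            problems.append(f"{name} is not set")

    # ARROW_LIBHDFS_DIR is optional; fall back to $HADOOP_HOME/lib/native.
    lib_dir = env.get("ARROW_LIBHDFS_DIR")
    if not lib_dir and env.get("HADOOP_HOME"):
        lib_dir = str(Path(env["HADOOP_HOME"]) / "lib" / "native")

    # Only check for the library when a candidate directory is known.
    if lib_dir and not (Path(lib_dir) / "libhdfs.so").is_file():
        problems.append(f"libhdfs.so not found in {lib_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_hdfs_env():
        print("WARNING:", problem)
```

An empty list means the required variables are set and libhdfs.so was found in the expected directory. It does not prove that PyArrow can actually connect to HDFS.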

Add the Hadoop jars to your CLASSPATH.

Linux:

    export CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath --glob`

Windows:

    %HADOOP_HOME%/bin/hadoop classpath --glob > %CLASSPATH%
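When the tools are driven from a Python script rather than a shell, the same CLASSPATH value can be populated in-process. The sketch below is illustrative only: `hadoop_classpath` and `export_classpath` are hypothetical helper names, and it assumes that `$HADOOP_HOME/bin/hadoop` is executable and that CLASSPATH is set before the first HDFS connection is opened.

```python
import os
import subprocess
from pathlib import Path


def hadoop_classpath(hadoop_home):
    """Run `hadoop classpath --glob` and return its output as one string."""
    hadoop_bin = Path(hadoop_home) / "bin" / "hadoop"
    result = subprocess.run(
        [str(hadoop_bin), "classpath", "--glob"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout.strip()


def export_classpath(env=None):
    """Store the Hadoop classpath under CLASSPATH in `env` and return it.

    Hypothetical helper; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    env["CLASSPATH"] = hadoop_classpath(env["HADOOP_HOME"])
    return env["CLASSPATH"]
```

Calling `export_classpath()` with no arguments updates the current process environment, which child processes started afterwards inherit.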

For more information on HDFS requirements, refer to the PyArrow HDFS documentation.