File Classifier
FileClassifier
FileClassifier is a Python-based command-line tool that automatically organizes files in a specified directory into predefined categories based on their file types. The tool supports multiple file formats, such as images, documents, videos, and more. It also comes with an extendable classifier that allows you to add topic modeling for better organization.
Features
- Automatic file organization based on file types.
- Predefined categories for common file formats.
- Extendable classifier with topic modeling support.
- Customizable output directory structure.
- Lightweight and easy to use.
Installation
To install FileClassifier, simply clone the repository and install the required dependencies:
git clone https://github.com/antoinebou12/FileClassifier.git
cd FileClassifier
pip install poetry
poetry install
Usage
To use FileClassifier, navigate to the project directory and run the main.py script with the required arguments:
python main.py [OPTIONS] INPUT_DIRECTORY OUTPUT_DIRECTORY
Arguments
INPUT_DIRECTORY: The directory containing the files to be organized.OUTPUT_DIRECTORY: The directory where the organized files will be moved to.
Options
--version: Show the version and exit.--types: List supported file types and their categories.--topic-modeling: Enable topic modeling for better organization (requires additional setup).
Example
python main.py /path/to/input /path/to/output
This command will organize the files in /path/to/input and move them to the appropriate folders in /path/to/output.
Extending the Classifier
To add topic modeling to the classifier, you need to modify the Classifier.py script and install additional dependencies. Please refer to the script comments and the provided documentation for more information on how to implement topic modeling.
Contributing
Contributions are welcome! If you would like to contribute to FileClassifier, please submit a pull request or open an issue with your ideas and suggestions.
License
FileClassifier is released under the MIT License. See the LICENSE file for more information.