SISU (Super Ingenious Sound Upscaler) is an experimental audio upscaler neural network

SISU

Overview

SISU (Super Ingenious Sound Upscaler) is an experimental project that uses GANs (Generative Adversarial Networks) to improve low-quality audio. The goal is to take degraded audio as input and produce a clearer, higher-quality version of it.

Structure of the Project

  • dataset: Contains sample audio files for testing.
  • models:
    • generator.py: Defines the generator, the network that improves the audio.
    • discriminator.py: Defines the discriminator, the network that judges whether a piece of audio is good or not.
  • training:
    • training.py: The script that trains the networks to improve audio.
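
The way the generator, the discriminator, and the training script relate can be illustrated with a minimal PyTorch sketch. This is not SISU's actual architecture: ToyGenerator, ToyDiscriminator, and train_step are hypothetical names, the layer sizes are arbitrary, and the plain binary cross-entropy GAN loss is an assumption chosen for brevity.

```python
import torch
from torch import nn


class ToyGenerator(nn.Module):
    """Hypothetical stand-in: maps a waveform (batch, 1, samples) to one of the same shape."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4),
            nn.LeakyReLU(0.2),
            nn.Conv1d(channels, 1, kernel_size=9, padding=4),
            nn.Tanh(),  # keeps output samples in [-1, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class ToyDiscriminator(nn.Module):
    """Hypothetical stand-in: returns one logit per clip (higher = judged as good audio)."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, stride=4, padding=4),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis
            nn.Flatten(),
            nn.Linear(channels, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def train_step(gen, disc, opt_g, opt_d, low, high):
    """One adversarial update on a batch of (low-quality, high-quality) clips."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(high.size(0), 1, device=high.device)
    fake = torch.zeros(high.size(0), 1, device=high.device)

    # 1) Discriminator: learn to tell real high-quality clips from generated ones.
    opt_d.zero_grad()
    generated = gen(low).detach()  # detach so this step does not update the generator
    d_loss = bce(disc(high), real) + bce(disc(generated), fake)
    d_loss.backward()
    opt_d.step()

    # 2) Generator: learn to produce clips the discriminator labels as real.
    opt_g.zero_grad()
    g_loss = bce(disc(gen(low)), real)
    g_loss.backward()
    opt_g.step()

    return d_loss.item(), g_loss.item()
```

A training script would call train_step in a loop over batches, with one optimizer per network (for example torch.optim.Adam). Audio GANs commonly add a reconstruction or spectral loss to the generator objective; that is left out here to keep the sketch short.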

Using the Project

  1. Set Up:

    • Make sure you have Python installed (version 3.8 or higher).
    • Install needed packages: pip install -r requirements.txt
    • Install a current version of PyTorch built for your hardware (CUDA, ROCm, or whatever your device supports).
  2. Prepare Audio Data:

    • Put your audio files in the dataset/good folder.
  3. Train the Model:

    • Run the training script: python training.py
  4. Generate Better Audio:

    • After training, you can use the generator to make your audio sound better.
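
Before starting a training run, the setup and data steps above can be checked with a short pre-flight script. The sketch below uses only the standard library. The helper names and the list of audio extensions are assumptions for illustration; the README does not state which file formats the training script accepts.

```python
import importlib.util
import sys
from pathlib import Path

# Assumed extensions; adjust to whatever formats the training script actually reads.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg"}


def python_version_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def pytorch_installed() -> bool:
    """True if the torch package can be found, without importing it."""
    return importlib.util.find_spec("torch") is not None


def find_audio_files(folder="dataset/good"):
    """Return a sorted list of audio files found under the given folder."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def preflight(folder="dataset/good"):
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if not python_version_ok():
        problems.append("Python 3.8 or higher is required.")
    if not pytorch_installed():
        problems.append("PyTorch is not installed.")
    if not find_audio_files(folder):
        problems.append(f"No audio files found in {folder}.")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Problem:", problem)
```

Run it from the project root so that the relative path dataset/good resolves correctly. If it prints nothing, the checks passed and python training.py can be started.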

License

This project is open source and licensed under the GPLv3. For details, see the LICENSE file.