PAI-API

The purpose of this repository is to provide a walkthrough for adding Perforated AI's Perforated Backpropagation™ to your code. When starting a new project, first add the sections from this README. Once they have been added, you can run your code and it will produce errors and warnings indicating whether any "customization" coding is actually required for your architecture. The fixes for these are described in customization.md, which also describes alternative options to the recommended settings here. After running your pipeline you can view the graph showing the correlation values and experiment with the other settings in customization.md that may help produce better results.

If you have not done so yet, go to our website and fill out the form to get your RSA license file, which is required to run the software. Once we have emailed it to you, put it in the folder you run your program from and the system will find it. During beta testing this project is free for all users, and any networks generated will always be usable without a license since they are just standard PyTorch models.

Files in the perforatedai folder provide information about functions and variables in the actual repository to ease usage and testing.

We are also trying to get connected into the PyTorch ecosystem, which requires the GitHub repository to have 300 stars. If you think this is a promising project and would like to support it, please click the star button to help us toward this goal.

1 - Main Script

First install perforatedai with

```
pip install perforatedai
```

1.1 - Imports

These are all the imports you will need at the top of your main training file. If some of the code below ends up in other files, those files will need the same imports.

```python
from perforatedai import pb_globals as PBG
from perforatedai import pb_models as PBM
from perforatedai import pb_utils as PBU
```
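The optimizer and scheduler setup in section 3 also calls torch.optim directly, so your main file will additionally need the standard PyTorch import:

```python
import torch  # required for the torch.optim calls used in section 3
```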

2 - Network Initialization

A large benefit PAI provides is automatic conversion of networks to work with Dendrite Nodes through the convertNetwork function.

2.1 - Network Conversion

The call to initializePB should be done directly after the model is initialized, before cuda and parallel calls.

```python
model = yourModel()
model = PBU.initializePB(model)
```
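For example, if your pipeline moves the model to the GPU and wraps it for multi-GPU training, a minimal sketch of the required ordering (device and the DataParallel wrap are placeholders for whatever your pipeline already does) looks like:

```python
model = yourModel()
model = PBU.initializePB(model)       # convert first
model = model.to(device)              # then apply cuda / device placement
model = torch.nn.DataParallel(model)  # then any parallel wrappers
```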

3 - Setup Optimizer and Scheduler

Set up your optimizer and scheduler this way instead. optimizer.step() should be kept where it is, but the scheduler will be stepped inside our code, so remove your scheduler.step() call if you have one. We recommend ReduceLROnPlateau, but any scheduler and optimizer should work.

```python
PBG.pbTracker.setOptimizer(torch.optim.Adam)
PBG.pbTracker.setScheduler(torch.optim.lr_scheduler.ReduceLROnPlateau)
optimArgs = {'params': model.parameters(), 'lr': learning_rate}
schedArgs = {'mode': 'max', 'patience': 5}  # make sure this is lower than epochs to switch
optimizer, PAIscheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
```

Get rid of scheduler.step() if there is one. If your scheduler does things in functions other than a plain scheduler.step(), this can cause problems, and you should simply not add the scheduler to our system.

Another note: weight decay seems to cause problems with PB learning. If you currently use weight decay, try running without it.
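For instance, if your existing optimArgs include a weight_decay entry, a PB run would simply drop it (the 1e-4 value below is only an illustrative placeholder):

```python
# Before: optimArgs = {'params': model.parameters(), 'lr': learning_rate, 'weight_decay': 1e-4}
# For PB, try without weight decay:
optimArgs = {'params': model.parameters(), 'lr': learning_rate}
```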

4 - Scores

4.1 - Training

Training in general can stay exactly as it is. But if you would also like to track the training score, you can optionally add the following at the end of your training loop:

```python
PBG.pbTracker.addExtraScore(trainingScore, 'Train')
```
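As a minimal sketch of where this fits, assuming a standard classification loop (train_loader, criterion, and the accuracy computation are placeholders from your own pipeline):

```python
model.train()
correct, total = 0, 0
for data, target in train_loader:
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()  # keep optimizer.step() exactly where it already is
    correct += (output.argmax(dim=1) == target).sum().item()
    total += target.size(0)
trainingScore = correct / total
PBG.pbTracker.addExtraScore(trainingScore, 'Train')
```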

4.2 - Testing

If you run testing periodically at the same time as validation (this is recommended), you can also call:

```python
PBG.pbTracker.addTestScore(testScore, 'Test Accuracy')
```

The test score should obviously not be used in the same way as the validation score for early stopping or other decisions. However, by calculating it at each epoch and calling this function, the system will automatically keep track of the test score affiliated with the highest validation score during each neuron training iteration. This will create a CSV file (..bestTestScore.csv) that neatly tracks the parameter counts of each cycle as well as the test score at the point of the highest validation score for that Dendrite count. If you do not call this function, the validation score will be used when producing this CSV file. This function should be called before addValidationScore.
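In practice the ordering looks like the sketch below; evaluate is a hypothetical helper standing in for your own test and validation passes, and addValidationScore is covered in section 5:

```python
testScore = evaluate(model, test_loader)  # hypothetical helper
PBG.pbTracker.addTestScore(testScore, 'Test Accuracy')  # call this first
valScore = evaluate(model, val_loader)
model, restructured, trainingComplete = PBG.pbTracker.addValidationScore(valScore, model)
```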

4.3 - Additional Scores

In addition to the above scores, which will be added to the graph, you may want to save scores that are not in the same format. A common reason for this is a project that calculates training loss, validation loss, and validation accuracy, but not training accuracy. You may want the graph to reflect the training and validation loss to confirm experiments are working and both losses are improving, but what matters most in the end is the validation accuracy. In cases like this, use the following to add scores to the CSV files but not to the graphs:

```python
PBG.pbTracker.addExtraScoreWithoutGraphing(extraScore, 'Test Accuracy')
```
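For the scenario above, a sketch (score names are placeholders) might record the training loss on the graph and keep the accuracy in the CSV files only:

```python
PBG.pbTracker.addExtraScore(trainLoss, 'Train Loss')  # graphed
PBG.pbTracker.addExtraScoreWithoutGraphing(valAccuracy, 'Validation Accuracy')  # CSV only
# the validation loss itself is passed to addValidationScore (section 5),
# which uses it to drive the switches between Dendrite and normal learning
```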

5 - Validation

5.1 - Main Validation Requirements

At the end of your validation loop the following must be called so the tracker knows when to switch between Dendrite learning and normal learning:

```python
model, restructured, trainingComplete = PBG.pbTracker.addValidationScore(score,
    model)  # use model.module instead if it is a DataParallel
# The following line should be replaced with whatever you use to set up
# GPU settings, including DataParallel
model.to(device)
if trainingComplete:
    break  # or do whatever you need to do once training is over
elif restructured:
    # if the model was restructured, reset the optimizer with the same
    # block of code you used above
    optimArgs = {'params': model.parameters(), 'lr': learning_rate}  # your args from above
    schedArgs = {'mode': 'max', 'patience': 5}  # your args from above
    optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
    # if you are using setOptimizerInstance you also need to do the full
    # reinitialization here
```

Description of variables:

- model - The model to continue using. It may be the same model or a restructured model.
- restructured - True if the model has been restructured, either by adding new Dendrites or by incorporating trained Dendrites into the model.
- trainingComplete - True if training is complete and further training will not be performed.
- score - The validation score you are using to determine whether the model is improving. It can be an actual score such as accuracy, or the loss value. If you are using a loss value, be sure you set maximizingScore to False when you called initialize().
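For example, when the score is a loss (lower is better), the scheduler mode from section 3 should match; a sketch, assuming maximizingScore was set to False at initialization as described above:

```python
schedArgs = {'mode': 'min', 'patience': 5}  # 'min' because a lower loss is better
valLoss = total_val_loss / len(val_loader)  # placeholder computation
model, restructured, trainingComplete = PBG.pbTracker.addValidationScore(valLoss, model)
```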

5.2 - Separate Validation Functions

If addValidationScore is called from within a test/validation function, you'll need to add the following return statement to that function:

```python
return model, optimizer, scheduler
```

And then capture all three where the function is called, like this:

```python
model, optimizer, scheduler = validate(model, otherArgs)
```

Additionally, make sure all three are passed into the function; otherwise they won't be defined in the cases where the network is not restructured.
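Putting sections 5.1 and 5.2 together, a minimal sketch of such a validation function (compute_score, val_loader, device, and learning_rate are placeholders for your own code) might look like this:

```python
def validate(model, optimizer, scheduler, val_loader):
    score = compute_score(model, val_loader)  # hypothetical metric helper
    model, restructured, trainingComplete = PBG.pbTracker.addValidationScore(score, model)
    model.to(device)  # replace with your GPU / DataParallel setup
    if restructured and not trainingComplete:
        # same optimizer/scheduler block as in section 3
        optimArgs = {'params': model.parameters(), 'lr': learning_rate}
        schedArgs = {'mode': 'max', 'patience': 5}
        optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
    # you may also want to return trainingComplete so the outer loop can stop
    return model, optimizer, scheduler

model, optimizer, scheduler = validate(model, optimizer, scheduler, val_loader)
```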

That's all that's required!

With this short README you are now set up to try your first experiment. Your first experiment will run with a default setting, PBG.testingDendriteCapacity, set to True. This tests your system by adding a set of Dendrites to ensure all setup parameters are correct and that the GPU can handle the size of the larger network. Once this has been confirmed, your script will output a message telling you the test has completed. After this message has been received, set the variable to False to run a real experiment.
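That switch is typically a single assignment near the top of your main script, after the imports:

```python
PBG.testingDendriteCapacity = False  # set once the capacity test has completed
```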

While you are testing the Dendrite capacity, warnings will automatically be generated when problems occur; these can be debugged using the customization README, which also contains some suggestions at the end for optimization that can make results even better. If actual problems occur that are not caught and shown to you, we have also created a debugging README with some of the errors we have seen and our suggestions for tracking them down. Please check the debugging README first, but then feel free to contact us if you'd like help. Additionally, to understand the output files, take a look at the output README.