Package - PPPLDeepLearning/plasma-python

FRNN

Package description

Fusion Recurrent Neural Net (FRNN) is a Python package implementing deep learning models for disruption prediction in tokamak fusion plasmas.

It consists of 4 core modules:

  • models: Python classes necessary to construct, train and optimize deep RNN models, including a distributed, data-parallel, synchronous implementation of mini-batch gradient descent. FRNN uses MPI for communication and supports the TensorFlow and Theano backends through Keras. FRNN also supports running hyperparameter searches.

  • preprocessors: signal preprocessing and normalization classes, including the methods necessary to prepare physical data for stateful LSTM training.

  • primitives: domain-specific abstractions implemented as Python classes. For instance, a Shot is a measurement of plasma current as a function of time. The Shot object contains attributes corresponding to the unique identifier of a shot, the disruption time in milliseconds, the time profile of the shot converted to time-to-disruption values, the validity of a shot (whether the plasma current reaches a certain value during the shot), etc. Other primitives include Machines and Signals, which carry the information necessary for incorporating physics data into the overall pipeline. Signals know the Machine they live on, their mds+ paths, the code for downloading them, their preprocessing approaches, their dimensionality, etc. Machines know which Signals are defined on them, which mds+ server houses the data, etc.

  • utilities: a set of auxiliary functions for preprocessing, performance evaluation and learning-curve analysis.
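To make the Shot description above more concrete, here is a minimal sketch of such an object. It is a hypothetical stand-in, not FRNN's actual class: the attribute names, the `is_valid` threshold argument and the example numbers are all assumptions made for illustration.

```python
class Shot(object):
    """Hypothetical stand-in for the Shot primitive (not the real FRNN class)."""

    def __init__(self, number, times_ms, plasma_current, t_disrupt_ms=None):
        self.number = number                        # unique shot identifier
        self.times_ms = list(times_ms)              # sample times, in milliseconds
        self.plasma_current = list(plasma_current)  # plasma current at each sample time
        self.t_disrupt_ms = t_disrupt_ms            # None for a non-disruptive shot

    def is_disruptive(self):
        return self.t_disrupt_ms is not None

    def is_valid(self, min_current):
        """Valid if the plasma current reaches min_current at some point in the shot."""
        return any(i >= min_current for i in self.plasma_current)

    def time_to_disruption(self):
        """Convert the time profile to time-to-disruption values (ms).

        Returns None for a non-disruptive shot, where the quantity is undefined.
        """
        if not self.is_disruptive():
            return None
        return [self.t_disrupt_ms - t for t in self.times_ms]


# Made-up example values, in arbitrary current units.
shot = Shot(number=12345, times_ms=[0.0, 10.0, 20.0, 30.0],
            plasma_current=[0.1, 0.8, 1.0, 0.2], t_disrupt_ms=30.0)
ttd = shot.time_to_disruption()
```

In this sketch the time-to-disruption values count down to zero at the disruption time, which is the kind of target a sequence model could be trained against.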

In addition to the utilities, FRNN supports TensorBoard scalar variable summaries, histograms of layers, activations and gradients, and graph visualizations.

This is a pure Python implementation for Python versions 2.7 and 3.6.

Installation

The package comes with a standard setup script and a list of dependencies, which include: mpi4py, TensorFlow, Theano, Keras, h5py, Pathos. Running on GPUs also requires a standard set of CUDA drivers.
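As a quick sanity check before installing, the following sketch reports which of the listed dependencies cannot be found on the current import path. It is an illustration only and not part of FRNN: the import names are assumed to be the lowercase forms of the package names above, and `importlib.util` limits it to Python 3.

```python
import importlib.util

# Import names assumed to correspond to the dependencies listed above.
DEPENDENCIES = ["mpi4py", "tensorflow", "theano", "keras", "h5py", "pathos"]


def missing_dependencies(names=DEPENDENCIES):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_dependencies()
print("missing:", ", ".join(missing) if missing else "none")
```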

Check out the repository and run the setup script:

git clone https://github.com/PPPLDeepLearning/plasma-python
cd plasma-python
python setup.py install

Prefix the last command with sudo if superuser permissions are needed, or add --home=~ to install into your home directory. The latter option requires an appropriate PYTHONPATH.

Alternatively, install with pip (there is no need to check out the repository in that case):

pip install -i https://testpypi.python.org/pypi plasma

Optionally, add --user to install into your home directory.

Module index

The Sphinx documentation pages for FRNN are being built up here: http://tigress-web.princeton.edu/~alexeys/docs-web/html/

Tutorials

For a tutorial, see: https://github.com/PPPLDeepLearning/plasma-python/blob/mpicc-travis/docs/PrincetonUTutorial.md

Releases

v0.9 - May 11, 2017

This release version corresponds to the GTC2017 presentation.

Establish an FRNN code workflow similar to that of typical distributed deep learning projects. First, the raw data is preprocessed and normalized. The preprocessing step involves cutting, resampling, and structuring the data, as well as determining and validating the disruptive properties of the shots considered. Various options for normalization are implemented.
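The cutting, resampling and normalization steps can be illustrated with a small pure-Python sketch. This is not FRNN's preprocessing code: the function names are hypothetical, linear interpolation is just one possible resampling choice, and min-max scaling is picked as a single example of a normalization option.

```python
def cut(times, values, t_start, t_end):
    """Keep only the samples with t_start <= t <= t_end."""
    kept = [(t, v) for t, v in zip(times, values) if t_start <= t <= t_end]
    return [t for t, _ in kept], [v for _, v in kept]


def resample(times, values, dt):
    """Linearly interpolate onto a uniform grid with spacing dt.

    Assumes at least two samples and strictly increasing times.
    """
    n_steps = int((times[-1] - times[0]) / dt + 1e-9)
    grid = [times[0] + k * dt for k in range(n_steps + 1)]
    out, j = [], 0
    for t in grid:
        # Advance to the input interval [times[j], times[j + 1]] containing t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out


def normalize_minmax(values):
    """Scale values to [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up, irregularly sampled signal.
times = [0.0, 1.0, 3.0, 4.0, 6.0]
values = [0.0, 2.0, 6.0, 8.0, 12.0]
t_cut, v_cut = cut(times, values, 0.0, 4.0)
grid, resampled = resample(t_cut, v_cut, dt=2.0)
normalized = normalize_minmax(resampled)
```

Putting every signal on a common uniform time grid before normalizing is what lets several signals from one shot be stacked into a single input sequence.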

Keep the structure of the Fusion Recurrent Neural Net (FRNN) deep learning code modular, with 4 main modules:

  1. models: Python classes necessary to construct, train and optimize deep RNN models, including a distributed data-parallel implementation of mini-batch gradient descent with MPI.

  2. preprocessors: signal preprocessing and normalization classes, including the methods necessary to prepare physical data for stateful RNN training.

  3. primitives: domain-specific abstractions implemented as Python classes. For instance, a Shot is a measurement of plasma current as a function of time. The Shot object contains attributes corresponding to the unique identifier of a shot, the disruption time in milliseconds, the time profile of the shot converted to time-to-disruption values, the validity of a shot (whether the plasma current reaches a certain value during the shot), etc.

  4. utilities: a set of auxiliary functions for preprocessing, performance evaluation and learning-curve analysis.
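The data-parallel mini-batch gradient descent mentioned under models can be sketched in a single process, with a plain function standing in for the MPI reduction. This is a toy illustration under stated assumptions, not FRNN's implementation: the one-parameter linear model, the function names and the hyperparameters are all invented for the example.

```python
def local_gradient(w, shard):
    """Mean gradient of 0.5 * (w * x - y) ** 2 with respect to w over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an MPI allreduce (sum) followed by division by the worker count."""
    return sum(values) / len(values)


def train(data, n_workers, lr=0.1, n_steps=200):
    # Equal-sized shards: the mean of per-worker means equals the mini-batch mean.
    shards = [data[rank::n_workers] for rank in range(n_workers)]
    w = 0.0  # every worker starts from the same weight
    for _ in range(n_steps):
        # In a real MPI run each rank would compute only its own gradient.
        grads = [local_gradient(w, shard) for shard in shards]
        # Synchronous step: all workers apply the same averaged gradient.
        w -= lr * allreduce_mean(grads)
    return w


# Made-up data generated from y = 3 * x.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_fit = train(data, n_workers=2)
```

Because every worker applies the same averaged gradient at each step, the weights stay identical across workers, and with equal-sized shards the result matches single-worker training on the full mini-batch.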