GSoC blog

1. Introduction¶

Picture this... May 17th, late afternoon, sitting on my couch exhausted... Probably watching Netflix and stressing out about how I should do some work on my thesis or, at least, do anything other than be a complete slacker.

I turn on my laptop, take a look at the mail and there's a very pleasant surprise:

Hi antelk,

Your proposal Support of the simulation-based inference with the model fitting toolbox has been accepted!

Welcome to GSoC 2021!

We look forward to seeing the great things you will accomplish this summer with INCF.

I have to be honest... I did not expect to get in, and I was equally thrilled and terrified. Still, more thrilled than terrified and I can proudly introduce myself as a 2021 GSoC student :)

2. Project synopsis¶

I will work on the project Support of the simulation-based inference with the model fitting toolbox together with my mentor Marcel Stimberg and the rest of the model fitting team under the Brian simulator project. Brian is a free and open-source, cross-platform simulator for biological neurons and spiking neural networks. It is written in Python with a strong focus on efficiency, and the entire code base is open source and available on GitHub.

The brian2modelfitting toolbox allows the user to find the best-fit values of unknown parameters for recorded traces and spike trains. The focus of my project is the integration of the sbi library into the brian2modelfitting toolbox to extend the currently implemented fitting capabilities for custom excitable single-cell models with simulation-based inference techniques. This will enable brian2modelfitting to return not just a single set of optimal parameters but the full posterior distribution over the parameters. It will also exploit prior system knowledge sparsely, using only the most important features to identify mechanistic models that are consistent with the measured data, i.e., recorded traces and spike trains. The sbi library is a completely separate effort and its development is coordinated at the Macke lab.
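To make the difference between a single best fit and a full posterior a bit more concrete, here is a minimal sketch that uses only the Python standard library. It is not how sbi works internally: sbi trains neural density estimators, while this toy swaps in plain rejection sampling (approximate Bayesian computation) because it fits in a few lines. The simulator, the prior range, and all function names are made up for illustration.

```python
import random
import statistics


def simulate(rate, rng, n_steps=100):
    """Toy simulator: count 'spikes' in n_steps independent time bins."""
    return sum(rng.random() < rate for _ in range(n_steps))


def rejection_abc(observed, n_draws=5000, tolerance=2, seed=0):
    """Keep the prior draws whose simulated output is close to the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 1.0)  # sample a parameter from a uniform prior
        if abs(simulate(rate, rng) - observed) <= tolerance:
            accepted.append(rate)  # parameter is consistent with the data
    return accepted


# Hypothetical observation: 30 spikes in 100 time bins.
posterior_samples = rejection_abc(observed=30)
print("accepted samples:", len(posterior_samples))
print("posterior mean:  ", statistics.mean(posterior_samples))
print("posterior spread:", statistics.stdev(posterior_samples))
```

The accepted samples approximate the posterior over the rate, so besides a point estimate we also get a measure of how uncertain that estimate is.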

Relevant URLs:

official summary of the project is available here;
for a detailed overview of the project, check my draft proposal;
my progress can also be tracked via brian2modelfitting project pages on GitHub.

3. First steps¶

During the first period of the project, i.e., from May 17th to June 7th, all students have to participate in something called Community bonding, during which we are required to become familiar with the community practices and processes by actively participating on mailing lists, IRC, Gitter, etc. Personally, I waited for a couple of days to have my first virtual meeting with Marcel. He explained the inner workings of GSoC and the INCF (International Neuroinformatics Coordinating Facility), an international non-profit umbrella organization that embraces the principles of Open, FAIR and Citable neuroscience, under which Marcel also operates for the duration of the GSoC. We agreed to have casual video calls each week and to communicate daily via e-mail, Gitter, and GitHub issues, but to do our best to keep most of the communication open to the entire community through the official Brian discourse channel.

I delved deep into both brian2modelfitting and sbi documentation. I also went through a couple of relevant tutorials:

official brian2 online tutorials;
examples for brian2modelfitting;
all tutorials and examples offered by the sbi development team.

I read a couple of recent papers that cover simulation-based inference:

a review paper by Cranmer et al. [2020], where the authors outline the principles of modern simulation-based inference supported by a likelihood-free statistical approach and machine learning;
a JOSS paper by Tejero-Cantero et al. [2020], which covers the implementation details of the sbi library and the currently implemented families of neural inference algorithms;
a preprint by Lueckmann et al. [2021], in which the authors investigate the performance of many simulation-based inference algorithms;
and an eLife article by Gonçalves et al. [2020], where the authors show concrete examples in which simulation-based inference truly shines.

I also wanted to refresh my memory on version control:

./missing-semester/version-control tutorial available here;
Pro Git book, relevant chapters, freely available here.

After the second virtual meeting, Marcel and I agreed that the next steps should be:

installing all dependencies for the future work and creating a development environment;
opening issues on GitHub that closely resemble the tasks for the first coding phase, i.e., from June 7th up to the Phase 1 evaluations scheduled for July 12th. All issues can be tracked through the project pages in the brian2modelfitting repository on GitHub.

The third virtual meeting resulted in a discussion on several implementation details and a final agreement on the development of the code itself and best practices for version control.

4. Weekly reports¶

Week 1¶

Jun 7th to June 13th (posted on June 11th)

What have I gotten done this week?¶

Played with more advanced examples in sbi and went through the Introduction to Brian online workshop.
Started to work on the wrapper class for the first stages of simulation-based inference that will ultimately allow the user to obtain the posterior over the unknown parameters with a few simple lines of code. This is related to issue #44.
Investigated dimensionality reduction techniques that automatically extract the relevant features from the traces and, therefore, allow the user to avoid manually extracting the summary statistics used to feed a neural density estimator of choice. I brushed up on my knowledge of dimensionality reduction by going through last year's Neuromatch Academy tutorial on the topic and then researched state-of-the-art methods, specifically the Time2Vec recurrent neural network, which provides a model-agnostic vector representation for time, Kazemi et al. [2019]. This is related to issue #45.
Found a way to efficiently store the results from the first stages of simulation-based inference, which are known to be time-consuming. Related to issue #46.
Had the fourth virtual meeting with Marcel.

What am I planning to do next week?¶

To start with the actual implementation of the wrapper class for the first stages of the simulation-based inference.

Is anything blocking my progress?¶

Nothing for now.

Week 2¶

Jun 14th to June 20th (posted on June 18th)

What have I gotten done this week?¶

Made progress with the wrapper class for sbi, issue #44.
Had a virtual meeting with Marcel where we discussed the implementation details of the feature metric that uses the eFEL library, or any other library for that matter, and is currently in use in the brian2modelfitting toolbox.
Read the paper by Bandt and Pompe [2002], in which the authors propose permutation entropy as a simple, lightweight and robust measure of complexity for time series data, which makes it a good fit for the voltage traces we deal with.
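For readers who have not met permutation entropy before, the idea can be sketched in plain Python as follows. This is my own minimal illustration of the measure from the paper, not the implementation used in any particular library; the function name and its arguments are chosen just for this example.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Permutation entropy (in bits) of a one-dimensional sequence."""
    n_windows = len(series) - (order - 1) * delay
    if n_windows < 1:
        raise ValueError("series is too short for the given order and delay")
    counts = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # The ordinal pattern is the index permutation that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    probs = [c / n_windows for c in counts.values()]
    entropy = sum(-p * math.log2(p) for p in probs)
    if normalize:
        # Divide by the maximum entropy, log2(order!), to land in [0, 1].
        entropy /= math.log2(math.factorial(order))
    return entropy


# A monotonic ramp contains a single ordinal pattern, so its entropy is zero.
ramp = list(range(50))
print(permutation_entropy(ramp))

# Short example sequence: four rising and two falling neighbouring pairs.
print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2, normalize=False))
```

Regular signals give values close to 0 and noisy signals give values close to 1 (when normalized), which is what makes the measure attractive as a cheap summary feature.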

What am I planning to do next week?¶

To finish the initial version of the wrapper class for the first stages of the simulation-based inference, issue #44.
To start working on issue #46, which deals with loading and storing simulated data traces, and issue #47, which deals with updating and restructuring the feature metric class.

Is anything blocking my progress?¶

Currently, no. However, I planned to do things a bit faster.

Week 3¶

Jun 21st to June 27th (posted on June 25th)

What have I gotten done this week?¶

Did some research on potential techniques for extracting summary features from electrophysiological recordings. I found an interesting package called antropy that provides multiple time-efficient algorithms for computing the complexity of time series. I decided to try it out on simulated voltage traces from the Hodgkin-Huxley neuron model and proposed a pull request that showcases the correlation between entropy metrics for voltage traces. My example has been merged into the master branch of the antropy repository and is now available as an official example :)
Finished the first version of the wrapper class that will allow the user to perform the simulation-based inference through the model fitting toolbox.
Got my pull request #49 merged:
- resolved issue #44 - wrapper class for sbi,
- and issue #50 - support of all inference techniques.
Had a virtual meeting with Marcel where we discussed the best way to load and store the simulated results and the trained neural density estimator, so that they can be reused either for new experiments with the same cell or for entirely different excitable cells. This is related to issue #46.
A summary of the progress and the to-do list are available in the Project pages of the brian2modelfitting repository on GitHub.

What am I planning to do next week?¶

To resolve issue #46.
Investigate additional visualization techniques of the parameter space, issue #51.

Is anything blocking my progress?¶

Week 4¶

Jun 28th to Jul 4th (posted on Jul 2nd)

What have I gotten done this week?¶

Resolved issue #46 and proposed a new pull request #52.
Read the paper on automatic fitting of spiking neuron models to electrophysiological recordings by Rossant et al. [2010] as the first step towards a better understanding of the rest of the model fitting toolbox and the fitting methods that are used throughout. Namely, during the weekly meeting with Marcel, the idea came up that the Inference and Fitter classes could potentially be refactored so that they share a common superclass.
Had a weekly meeting with Marcel where we mostly discussed the changes that have to be made in pull request #52 regarding the storing and loading of the generated data.

What am I planning to do next week?¶

Update pull request #52.
Some code refactoring.
Start the initial work on issue #47 that deals with large-scale simulations.

Is anything blocking my progress?¶

I was fighting a few bugs during the first part of the week, which Marcel resolved during our weekly meeting.

Week 5¶

Jul 5th to Jul 11th (posted on Jul 9th)

What have I gotten done this week?¶

Made a few updates to pull request #52, mostly refactoring and some fixes.
Created an advanced example on how to use all features of sbi through the model fitting toolbox.
Had a virtual meeting with Marcel where we discussed future work, mostly regarding additional working examples and tests.

What am I planning to do next week?¶

Finalize all the work that has to be done with regards to pull request #52.
Adjust the example on how to use brian2 with sbi in the official advanced section of brian2 examples, available here, to the current API of the model fitting toolbox.
Create an example where it is shown how to deal with issue #47.
Start to work on documenting the project.

Is anything blocking my progress?¶

At the beginning of this week I was slowed down by work commitments that took up most of my time.

Week 6¶

Jul 12th to Jul 18th (posted on Jul 16th)

What have I gotten done this week?¶

Had pull request #52 finally merged into the sbi_support branch.
Had a virtual meeting with Marcel where the rest of the work for the project was defined:
- Additional visualization exploration of sampled posterior distribution of unknown parameters, issue #51
- Supporting multiple output variables to be observed when the inference is performed, issues #53 and #54
- Additional refactoring of the Inferencer class, issue #57
- Start with tests and docs, issues #56 and #55, respectively.
Did some work on issues #45, #47 and #51.

What am I planning to do next week?¶

Solve issues #47 and #51.
Continue my work on adjusting the example on how to use brian2 with sbi in the official advanced section of brian2 examples, available here, to the current API of the model fitting toolbox.

Is anything blocking my progress?¶

Not at this point.

Week 7¶

Jul 19th to Jul 25th (posted on Jul 23rd)

What have I gotten done this week?¶

Resolved issues #47 and #51, which deal with the handling of large-scale simulations and the visualization of the posterior, respectively.
Proposed pull request #58 that resolves issue #51.
Continued the discussion on issue #45 with the sbi developers; this issue should be resolved by the end of this week.

What am I planning to do next week?¶

Continue the work on the proposed pull request and continue the research on dimensionality reduction, issue #45.
Add support for multiple-output inference and for spikes as the output variable, as defined in issues #53 and #54, respectively.

Is anything blocking my progress?¶

Week 8¶

Jul 26th to Aug 1st (posted on Aug 2nd)

What have I gotten done this week?¶

Finalized the work on dimensionality reduction. Issue #45 should be closed together with issues #53 and #54 in the same pull request.

What am I planning to do next week?¶

Continue the work from last week. I should get back up to speed, because I did virtually no work this week due to health issues.

Is anything blocking my progress?¶

Health issues in the first part of the week. Also, a few bugs slowed down my progress; they should be resolved by the end of this week. I also took a short break during the weekend.

Week 9¶

Aug 2nd to Aug 8th (posted on Aug 7th)

What have I gotten done this week?¶

PR #58 has been merged:
- Resolved issue #45 - the user is able to pass an empty list of parameters and brian2modelfitting will by default turn on the sbi automatic feature extraction
- Resolved issue #51 - the user can investigate the posterior distribution in more detail by using conditional pair plots and by computing the conditional correlation matrix
Proposed PR #59 that resolves issue #53 - instead of a list of callables, where each element of the list is used to extract a single feature from the trace, the user is able to pass a dictionary mapping each output variable name to a list of features. This PR also resolves issue #48 as described in my comment here.
Started initial work to support spikes in the Inferencer class.
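The dictionary of features mentioned above can be pictured with the following plain-Python sketch. It only illustrates the data structure: the `extract_features` helper, the variable names and the toy data are hypothetical and are not taken from the brian2modelfitting code base.

```python
from statistics import mean


def extract_features(outputs, features):
    """Apply every feature callable to the recording of its output variable."""
    summary = []
    for name, callables in features.items():
        trace = outputs[name]
        summary.extend(float(func(trace)) for func in callables)
    return summary


# Hypothetical simulated outputs: a voltage trace (mV) and spike times (ms).
outputs = {
    "v": [-65.0, -64.2, -50.1, 20.3, -70.4, -66.0],
    "spikes": [3.1, 12.7, 25.4],
}

# Each output variable name maps to a list of callables,
# and every callable returns a single summary feature.
features = {
    "v": [max, mean],   # peak voltage and mean voltage
    "spikes": [len],    # number of spikes
}

summary = extract_features(outputs, features)
print(summary)
```

Compared to a flat list of callables, the dictionary makes it explicit which recording each feature is computed from, which matters once more than one output variable is observed.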

What am I planning to do next week?¶

Solve issue #54 that will enable the support of spikes for sbi.
Consider refactoring the Inferencer and Fitter classes.
Wrap up all the work: create a comprehensive example that will be a part of documentation, compile docs and start the initial work on tests.

Is anything blocking my progress?¶

Week 10¶

Aug 9th to Aug 15th (posted on Aug 14th, updated on Aug 15th)

What have I gotten done this week?¶

PR #59 has been merged:
- Resolves issue #53 - instead of a list of callables that are used to extract summary features from the neural recordings, a dictionary should be provided. The keys are the names of the observed output variables (voltage traces, channel activations, spike trains...) as defined in the model equations, while the respective values are lists where each element is a callable that returns a summary feature.
- New examples are added.
PR #60 has been merged:
- Resolves issue #54 - spikes are now handled like any other output variable.
- Updated docstrings.
- Errors fixed.
- Improved user input handling.
PR #61 has been merged:
- Resolves issue #62 - training of the neural density estimator is now possible on a GPU.
- Updated docstrings.
Wrapped up all the work regarding the code: all examples are updated to the current state of the API.
Started with the documentation, which is, for now, available at https://brian2modelfitting.readthedocs.io/en/sbi_support/ and is subject to constant changes.
Provided Marcel with the entire code.
The current state of the sbi integration into brian2modelfitting is available in the Project pages of the brian2modelfitting GitHub repository.

What am I planning to do next week?¶

Write up the documentation, which will be available at https://brian2modelfitting.readthedocs.io/en/sbi_support/ until the sbi_support branch is merged into the master branch. After the merge, the full documentation will be available at https://brian2modelfitting.readthedocs.io.
Test the code.

Is anything blocking my progress?¶

Final Week¶

Aug 16th to Aug 22nd (posted on Aug 23rd)

I did some final work on the documentation and created a comprehensive tutorial on how to perform simulation-based inference by using brian2modelfitting. I also made some smaller changes in the code while testing and writing the documentation. Finally, I proposed the last pull request for this project. The pull request #60 summarizes all the work done during GSoC 2021 and it serves as the final work product that is used for the second round of evaluations.

5. Conclusion¶

Tl;dr: GSoC was a lot of fun... hard, but fun!

Considering my original project plan, I would say that I delivered all the features that Marcel and I agreed on during the initial stage of the project. The only thing I would do differently is to start with tests earlier.

It was definitely not easy to balance obligations at the university, health problems that can happen (and that happened to me during this project), and respecting the deadlines and requirements defined together with the mentor. However, this was a great experience and I would do it again. Work habits, speed of learning new, foreign concepts, programming skills... all of this, at least in my case, has improved considerably. In addition, my communication skills, especially as a non-native English speaker, have improved - I have become more confident when addressing people who are often extremely big names in this area. Of course, the knowledge I gained, which spilled over from my field of computational electromagnetism to computational neuroscience, helped me broaden my horizons and think about how to apply the newly acquired skills to my own research.

So my advice to any potential 2022 GSoC student is: do not hesitate to sign up, do not hesitate to get in touch with anyone who can help you to become a GSoC student, be a good person, and do what you love and do your best. Note that the GSoC can be very hard, sometimes extremely stressful, but it all pays off once you submit the last pull request :)

6. References¶

Christoph Bandt, Bernd Pompe. Permutation entropy: A natural complexity measure for time series, Physical Review Letters 2002, 88, 174102; doi: 10.1103/physrevlett.88.174102

Kyle Cranmer, Johann Brehmer, Gilles Louppe. The frontier of simulation-based inference, Proceedings of the National Academy of Sciences Dec 2020, 117 (48) 30055-30062; doi: 10.1073/pnas.1912789117

Pedro J Gonçalves, Jan-Matthis Lueckmann, Michael Deistler, Marcel Nonnenmacher, Kaan Öcal, Giacomo Bassetto, Chaitanya Chintaluri, William F Podlaski, Sara A Haddad, Tim P Vogels, David S Greenberg, Jakob H Macke. Training deep neural density estimators to identify mechanistic models of neural dynamics, eLife 2020 9:e56261; doi: 10.7554/eLife.56261

Seyed Mehran Kazemi, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, Marcus Brubaker. Time2Vec: Learning a Vector Representation of Time, preprint 2019

Jan-Matthis Lueckmann, Jan Boelts, David S Greenberg, Pedro J Gonçalves, Jakob H Macke. Benchmarking Simulation-Based Inference, preprint 2021

Cyrille Rossant, Dan F. M. Goodman, Jonathan Platkiewicz and Romain Brette. Automatic fitting of spiking neuron models to electrophysiological recordings, Frontiers in Neuroinformatics 2010, 4; doi: 10.3389/neuro.11.002.2010

Alvaro Tejero-Cantero, Jan Boelts, Michael Deistler, Jan-Matthis Lueckmann, Conor Durkan, Pedro J Gonçalves, David S Greenberg, Jakob H Macke. sbi: A toolkit for simulation-based inference, Journal of Open Source Software 2020, 5(52), 2505; doi: 10.21105/joss.02505
