Key Lessons From 600+ Hours of dLight Recordings
Chapter 3 – Analysis
Picture a squiggly line dancing across your screen, whether it's GCaMP for calcium or dLight for dopamine, reacting in real time. Sounds simple, right? It’s not. After 600+ hours of recording, I’ve learned that these squiggly lines get messy fast. Let’s break it down.
The Photobleaching Problem
The culprit? Photobleaching. At first, your recordings look great – tons of photons lighting up the screen. But as time goes on, those photons start bailing on you, and the fluorescence dims. The result? A baseline that drifts away, getting lower and lower.
Photobleaching wrecks your baseline, making it hard to tell if you’re seeing real signal changes or just noise. Most people deal with this by fitting a baseline curve using some sort of regression method and using that curve to perform baseline correction in the dFF calculation.
The numerator (F - F0) is your reset button – it subtracts the drifting baseline so your signal sits around zero and the ups and downs make sense, even when photobleaching tries to mess things up. The denominator (F0) scales everything relative to that baseline, so a given change means the same thing at the start and the end of the session.
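To make that concrete, here is a minimal NumPy sketch of the whole pipeline on a synthetic trace. Everything here is an assumption for illustration: the 10 Hz rate, the exponential bleach, the 5% transients, and the cubic-polynomial baseline (a simple stand-in – many labs fit a bi-exponential instead).

```python
import numpy as np

# --- Synthetic "raw" trace: exponential photobleach plus a few transients ---
fs = 10.0                                  # sampling rate (Hz), assumed for the demo
t = np.arange(0, 600, 1 / fs)              # 10-minute session
bleach = 1.0 * np.exp(-t / 300) + 0.5      # decaying baseline (photobleaching)
dff_true = np.zeros_like(t)
for onset in (60, 180, 300, 420, 540):     # five transients, 5% amplitude
    dff_true += 0.05 * np.exp(-0.5 * ((t - onset) / 1.0) ** 2)
raw = bleach * (1 + dff_true)

# --- Baseline fit: low-order polynomial as a stand-in for a bi-exponential ---
coeffs = np.polyfit(t, raw, deg=3)
f0 = np.polyval(coeffs, t)                 # fitted baseline F0(t)

# --- dFF: subtract the drifting baseline (numerator), scale by it (denominator) ---
dff = (raw - f0) / f0

print(f"median dFF: {np.median(dff):.4f}")   # near zero if the baseline fit is sane
print(f"peak dFF:   {dff.max():.4f}")        # transients survive the correction
```

Plotting `raw`, `f0`, and `dff` at each stage is exactly the kind of sanity check I argue for later in this post.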
The dFF calculation is a relatively simple equation, but as I explain in the next section, the way you define your baseline can have a major influence on the result. That is not good: your choice of baseline method shouldn’t change the result enough to change your interpretation.
When baselining gets tricky
I promise I won't get too quantum physics on you with this analogy: think of measuring a fluorescence signal like guessing where an electron is in orbit—it’s everywhere until you measure it. The same goes for your sensor: until you hit record, you don't truly know where your baseline lies.
Why is this important? Let’s use cLight, the CRF sensor, to explain this. CRF is a neuropeptide that’s heavily involved in regulating the stress response. When your animal is in their home cage relaxing, there is probably very little CRF activity. As you move the animal to couple the patch cord and start the photometry recording, you are going to stress them out, causing an increase in CRF (and cLight) activity.
But you haven’t started recording photometry signals yet, so you are missing this change in baseline. The moment you start recording, the cLight baseline is going to be elevated, at least compared to the baseline when the animal was in the home cage. So, the baseline we record is not the “true baseline.” (The best way to correct for this is to habituate the animals to the patch cording procedure before your experiment.)
What is the “true” baseline?
We can't know the exact baseline using optical methods, but we can get a better estimate by measuring the sensor's lifetime (more on that in a future post).
GPCR-based sensors like cLight are more sensitive to baseline concentration changes than GCaMP, because their ligands fluctuate while calcium levels remain relatively stable over time. Activities that cause stress increase CRF levels, and since cLight is expressed on the cell membrane, it continuously detects CRF as long as it’s present in the extracellular space. In contrast, GCaMP, expressed in the cytosol, responds to calcium, which stays within a specific range during the action potential cycle, making it less prone to baseline fluctuations.
What can we do if we don’t have a “true” baseline?
Here’s the thing: dLight and GRAB-DA can’t tell you how much dopamine is floating around. They only show you the changes between point A and point B. Big fluorescence changes mean something’s happening—but you won’t know how much dopamine is in play.
One additional example to drive this point home: you can infer that accumbens has way more dopamine than prefrontal cortex because the CMOS photon count/photodetector voltage level is much higher when recording in accumbens than cortex, but this does not tell you how much more dopamine is in accumbens. You can only conclude that there is more.
Ok, I get ‘relative concentration change,’ but what’s that gotta do with baselining?
Before you baseline, spend time with your raw data. If your transients are clear in the raw signal, they should show up in dFF. If not, something’s off with your baseline method. Simple.
As much as I appreciate all the nice fiber photometry analysis packages out there like pMAT from the Barker lab and GuPPY from the Lerner lab, I’m afraid that these packages have made people (including myself) complacent about jumping straight into dFF calculations. The best thing to do is plot the output from each stage of the pre-processing as you go through it: raw data, baseline fit, baseline corrected (photobleach corrected), and dFF calculation (scaling).
For the record, I’ve tried pMAT, GuPPY, and many others - I name these two since I've spent the most time with them. They’re all great, but the engineer in me likes to write my own code. Everything I write is inspired by other people's code, but I usually find a way of making it my own by adding plenty of nested for loops. (Please ping me if you'd like me to share my analysis code.)
Baselining Options
Baselining depends on your setup and the sensor. For GCaMP with an isosbestic channel, most people fit a regression line to the isosbestic signal and use it to predict the baseline for the ligand channel.
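A minimal sketch of that isosbestic approach, assuming Python/NumPy. The synthetic channels, the shared sinusoidal artifact, and the single transient are all made up for illustration; real pipelines usually low-pass filter the isosbestic fit first.

```python
import numpy as np

# --- Synthetic channels: a shared artifact, a transient only in the ligand channel ---
t = np.arange(3000) / 30.0                      # 100 s at 30 Hz (assumed)
artifact = 0.05 * np.sin(2 * np.pi * 0.2 * t)   # wiggle common to both channels
dff_true = np.zeros_like(t)
dff_true[1000:1060] = 0.08                      # one 2-s transient, ligand channel only
isos = 1.0 * (1 + artifact)                     # isosbestic channel (ligand-independent)
sig = 2.0 * (1 + artifact + dff_true)           # ligand channel

# --- Fit the isosbestic to the ligand channel; use the fit as the baseline ---
slope, intercept = np.polyfit(isos, sig, 1)
f0 = slope * isos + intercept                   # predicted baseline for the ligand channel
dff = (sig - f0) / f0

print(f"baseline noise (std, pre-transient): {dff[:900].std():.4f}")
print(f"transient peak: {dff.max():.4f}")
```

The shared artifact is largely regressed out while the ligand-only transient survives – that is the whole point of the isosbestic channel.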
As I explained above, GCaMP is unique compared to the GPCR-based sensors, as calcium concentrations are relatively stable, and the excitation/emission curve is flat around the isosbestic wavelength. dLight has a bimodal excitation/emission curve with a small bump around isosbestic (430nm) and a larger bump around peak excitation (490nm).
Here are analysis methods I’ve tried with dLight (and other GPCR-based sensors):
Once you have dFF calculated, the rest of the analysis is easy. If you’re looking for the average dFF response to an event of interest during a behavior like a foot shock or a reward collection, you simply average together the dFF signals from all these events. If you want to compare the response to different events, you can find the response properties around each event (peak height, area under the curve, full width half maximum, rise/fall time, etc.) and run statistics on those.
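The event-triggered part of that can be sketched like this (Python/NumPy; the sampling rate, event times, and 4% synthetic responses are all assumptions for the demo):

```python
import numpy as np

fs = 20.0                                        # sampling rate (Hz), assumed
t = np.arange(0, 300, 1 / fs)                    # 5-minute dFF trace (synthetic)
events = np.array([60.0, 120.0, 180.0, 240.0])   # event times (s), e.g. reward collections
dff = np.zeros_like(t)
for ev in events:                                # fake a 4% response after each event
    dff += 0.04 * np.exp(-0.5 * ((t - ev - 0.5) / 0.3) ** 2)

# --- Cut a window around each event and average across trials ---
pre, post = 2.0, 5.0                             # seconds before/after the event
n_pre, n_post = int(pre * fs), int(post * fs)
trials = np.stack([
    dff[int(ev * fs) - n_pre : int(ev * fs) + n_post] for ev in events
])                                               # one row per event
mean_response = trials.mean(axis=0)              # average event-triggered dFF

# --- Simple response properties for running stats across events/animals ---
peaks = trials.max(axis=1)                       # peak height per trial
aucs = trials[:, n_pre:].sum(axis=1) / fs        # area under the curve after the event
print(f"peaks per trial: {np.round(peaks, 3)}")
print(f"mean AUC: {aucs.mean():.3f}")
```

With real data, `dff` would come out of your baseline-correction step and `events` from your behavior timestamps.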
Conclusion for Chapter 3 - Analysis
That’s enough for now (except for a quick note on z-scoring below). I suggest experimenting with different analysis methods before locking in your pipeline. GuPPY and pMAT are solid options, but be sure you understand what’s going on under the hood. And if you’re not a programming whiz, don’t worry—there are awesome tools out there to help. I use GitHub Copilot (it’s free with an academic email), and it’s been a huge help in tidying up my clunky code. It’s also great at suggesting ideas since it’s trained on all of GitHub, so it’s basically replaced StackExchange for me. I’ve also heard good things about Cursor for generating code from plain English descriptions and plan to check it out soon.
Conclusion for Key Lessons From 600+ Hours of dLight Recordings Mini-Series
In the last three posts, I’ve covered the key steps for fiber photometry—surgeries, recordings, and analysis. My goal is to help fellow grad students just starting out with these experiments. When I began, most resources didn’t explain the 'why' behind the methods. Maybe it's because I’ve taken few biology courses, or maybe it’s just my skeptical mindset, but I’m always asking 'why.' I hope my explanations and analogies make things clearer. If anything’s still unclear, feel free to reach out.
A quick aside on z-scoring photometry signals: In addition to session z-scoring, a lot of people like to do an “event normalization” when analyzing fiber photometry data. This is where they calculate the z-score of their photometry signal relative to an event of interest and then average the event z-scores. This is extremely useful for making robust fluorescence transients really stand out, except in certain circumstances.
The z-score takes your fluorescence changes and puts them in terms of standard deviations. The standard deviation of a GPCR-based sensor varies quite widely from sensor to sensor and brain region to brain region. For example, dLight3 in dorsolateral striatum always gives an extremely large and high-frequency signal regardless of what the animal is doing. This means the standard deviation is high all the time, so when you calculate a session or event z-score, you might minimize the z-score in response to the event of interest instead of making it pop out.
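To make the pitfall concrete, here’s a hedged sketch (Python/NumPy; the noise levels, window lengths, and transient size are invented): the exact same 5% transient produces a much smaller event z-score when the surrounding signal is noisy, DLS-style, because the baseline standard deviation in the denominator is inflated.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 20.0                                                  # sampling rate (Hz), assumed
t = np.arange(0, 60, 1 / fs)
response = 0.05 * np.exp(-0.5 * ((t - 30.5) / 0.3) ** 2)   # same 5% transient at t = 30 s

def event_zscore(dff, event_s, pre=5.0, post=5.0):
    """z-score a peri-event window against its own pre-event baseline."""
    i = int(event_s * fs)
    n_pre, n_post = int(pre * fs), int(post * fs)
    window = dff[i - n_pre : i + n_post]
    base = window[:n_pre]                  # pre-event samples define mean and SD
    return (window - base.mean()) / base.std()

quiet = response + rng.normal(0, 0.005, t.size)   # low-variance region
noisy = response + rng.normal(0, 0.02, t.size)    # high-variance region (DLS-like)

print(f"peak z, quiet region: {event_zscore(quiet, 30.0).max():.1f}")
print(f"peak z, noisy region: {event_zscore(noisy, 30.0).max():.1f}")
```

Same biology, very different z-scores – which is why a high-variance region like dorsolateral striatum can make a genuine response look unimpressive after z-scoring.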