Michael Wimsatt's GitHub blog

Deep Dive on One Year of Weight Data - Part II

In the last post, I dug into a year’s worth of weight data, describing the patterns and trends I saw, and offering some explanations for each. Here we’re going to measure my results against the theoretical weight loss based on well-established calorie-to-fat (and muscle) formulas to see just how predictive my recorded food and exercise data really were.

## Analyzing calories in and calories out

In addition to weighing myself daily, I recorded what I ate and all my exercise using an iPhone app called LoseIt!. I’ll attempt to determine how much of my weight loss was directly attributable to my calorie deficit.

### Calculating calorie deficit

First of all, how does one calculate a calorie deficit? Generally, it looks like this:

cal_deficit = base_cal_burn + exercise_cal_burn - cal_consumption

where base_cal_burn is derived from your Basal Metabolic Rate (BMR), scaled by a multiplier for “base” activity level to yield a total daily energy expenditure (TDEE). Because BMR depends on body weight, it declines as you lose weight.
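The tdee helper called in the next snippet is not defined in this post. Here is a minimal sketch of what such a function might look like, assuming a Harris-Benedict-style BMR formula with rounded coefficients, weight in pounds, and height in inches. The function body, the constants, and the unit conventions are all assumptions for illustration, not the original code. With these coefficients the sketch lands within a calorie of the minimum and maximum Base Calories in the summary table further down, which is suggestive but does not confirm that the original helper works this way.

```python
LB_TO_KG = 0.45359237  # exact pounds-to-kilograms factor
IN_TO_CM = 2.54        # exact inches-to-centimetres factor


def tdee(weight_lb, height, age, sex='m', multiplier=1.2):
    """Estimate total daily energy expenditure in calories.

    Hypothetical stand-in for the post's helper: a Harris-Benedict-style
    BMR with rounded coefficients, scaled by an activity multiplier
    (1.2 is the usual "sedentary" value).
    """
    kg = weight_lb * LB_TO_KG
    cm = height * IN_TO_CM
    if sex == 'm':
        bmr = 66 + 13.7 * kg + 5.0 * cm - 6.8 * age
    else:
        bmr = 655 + 9.6 * kg + 1.8 * cm - 4.7 * age
    return bmr * multiplier


# Heaviest and lightest weights from the summary table (pounds)
heaviest = tdee(276.2, height=72, age=36)
lightest = tdee(204.2, height=72, age=36)
print(round(heaviest), round(lightest))  # prints: 2942 2405
```

Because the body uses only arithmetic, the same function also works elementwise when weight_lb is a pandas Series of daily weights.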

base_cal_burn = tdee(w1y,height=72,age=36,sex='m',multiplier=1.2)
ax = base_cal_burn.plot()
ax.set_ylabel('TDEE - Sedentary (cals)');

As you can see, my base calorie burn declined significantly as I lost weight, from roughly 2,940 calories per day at my heaviest to about 2,400 at my lightest. This is one of the insidious challenges in weight loss: after a significant change, you must maintain a much lower calorie intake (or offsetting exercise) to stay at your goal weight.

# Daily exercise calories, restricted to the one-year study window
exercise_cal_burn = (
    pd.read_csv('data/ExerciseCalories4392.csv', index_col='Date', parse_dates=True)
    .sort_index()['Exercise Calories']
    .loc['2009-06-26':'2010-06-22']
)
ax = exercise_cal_burn.plot(style='b.')
ax.set_ylabel('Exercise Calories (cal)');

As you can see, I exercised regularly throughout my weight loss, with the total energy expenditure increasing significantly in 2010 as I started running longer distances.

# Daily food calories, restricted to the same one-year window
cal_consumption = (
    pd.read_csv('data/FoodCalories4392.csv', index_col='Date', parse_dates=True)
    .sort_index()['Food Calories']
    .loc['2009-06-26':'2010-06-22']
)
ax = cal_consumption.plot(style='b.')
ax.set_ylabel('Calories Consumed (cal)');

It looks like my calorie consumption range was pretty tight earlier on and varied much more widely toward the end of the weight loss period. Also, you’ll note some very low and “zero” calorie consumption days. As an example, on August 15, 2009 I apparently consumed only 285 calories. Here’s what it looks like:

So, apparently I ate a granola bar (typical pre-run snack), went for a three-mile run, ate a banana (typical post-run snack), then ate nothing else the rest of the day. I suppose this is theoretically possible. I’m more inclined to think I got lazy and stopped recording calories.

So, I looked through my calendar and emails from that week. It turns out I was at a family reunion in upstate New York that weekend. I assure you I ate - and drank - plenty that day and the rest of the weekend. So, this is one of a few data points that I need to clean up. In fact, I think I’ll ignore all days that are below a certain threshold on calorie consumption, and assume I had a net calorie deficit of zero. I suspect this is an optimistic assumption, but more on that later.

cal_consumption.hist(color='k', alpha=0.5, bins=20);

I can believe I had days of calorie consumption in the low 1,000s, but anything under 1,000 calories is suspect, so I’ll mark those (including the zero days) as missing values (NaN).

cal_consumption = cal_consumption.where(cal_consumption >= 1000)
cal_consumption.hist(color='k', alpha=0.5, bins=20);
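To make the effect of that filter concrete, here is a small sketch on made-up numbers (illustrative values only, not my actual records). It shows that Series.where replaces failing values with NaN, that the NaN carries through to anything computed from the column, and that pandas skips NaN when averaging. Skipping a day is not the same as treating it as a zero-deficit day; that would need an explicit fillna(0), shown on the last line for comparison.

```python
import pandas as pd

# Made-up daily totals for illustration only
food = pd.Series([2100.0, 285.0, 0.0, 1850.0],
                 index=pd.date_range('2009-08-14', periods=4))
base = pd.Series(2700.0, index=food.index)

cleaned = food.where(food >= 1000)  # values failing the condition become NaN
deficit = base - cleaned            # NaN propagates into the derived column

print(int(cleaned.isna().sum()))    # 2 days masked
print(deficit.mean())               # NaN days skipped: (600 + 850) / 2 = 725.0
print(deficit.fillna(0).mean())     # zero-deficit assumption: 1450 / 4 = 362.5
```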

That filtered out about 30 points. Those low ones still look a bit suspect, but let’s run with it like this. Those high ones look like outliers, too, but I can’t imagine recording calories I didn’t actually consume, so I think those are real. I like KDE plots, so let’s look at it like that, too.
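The code for the KDE plot is not shown here. A minimal sketch of how one could be drawn with pandas follows, using a synthetic stand-in series rather than my real cal_consumption data. The random sample, its parameters, and the axis label are assumptions for illustration, and kind='kde' needs SciPy installed.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Synthetic stand-in for cal_consumption (illustration only)
rng = np.random.default_rng(0)
sample = pd.Series(rng.normal(2200, 600, size=300)).clip(lower=1000)

# Kernel density estimate of the distribution (pandas uses SciPy for this)
ax = sample.plot(kind='kde')
ax.set_xlabel('Calories Consumed (cal)')
```

In a notebook the backend line can be dropped, and the real series would be plotted with cal_consumption.dropna().plot(kind='kde').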


OK, let’s combine it all in a data frame.

from pandas import DataFrame, Timestamp as ts

weightdf = DataFrame({'Weight': w1y, 'Base Calories': base_cal_burn,
                       'Exercise Calories': exercise_cal_burn,
                       'Food Calories': cal_consumption})
# Drop partial weeks at the start and end of the period
drop_dates = ('2009-6-26', '2009-6-27', '2009-6-28', '2010-6-21', '2010-6-22')
weightdf = weightdf.drop([ts(d) for d in drop_dates])

# Calculate calorie deficit (NaN on days where food calories were masked out)
weightdf['Calorie Deficit'] = (weightdf['Base Calories'] + weightdf['Exercise Calories']
                               - weightdf['Food Calories'])
# Calorie equivalent of the average daily weight loss, at 3,500 calories per pound
print((weightdf['Weight'][ts('6-29-2009')] - weightdf['Weight'][ts('6-20-2010')]) * 3500 / 356)
# Summary statistics for every column
weightdf.describe()
           Base Calories  Exercise Calories  Food Calories      Weight  Calorie Deficit
    count     357.000000         357.000000     319.000000  357.000000       319.000000
    mean     2616.102449         228.925728    2269.690721  232.448739       595.182921
    std       135.540163         308.787380     735.925260   18.176104       779.746546
    min      2405.450116           0.000000    1028.350000  204.200000     -2191.817612
    25%      2496.426157           0.000000    1723.000000  216.400000        82.876077
    50%      2608.281944           0.000000    2043.630000  231.400000       810.317858
    75%      2720.137731         492.029000    2717.205000  246.400000      1146.106112
    max      2942.357895        1485.970000    5329.970000  276.200000      2277.650299

On average, I

  • burned 2,616 + 229 = 2,845 calories per day.
  • consumed 2,270 calories per day.
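  • had a calorie deficit of 595 calories per day. (This is averaged over only the 319 days with usable food data, which is why it differs from 2,845 - 2,270 = 575.)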

Also, I lost 71 pounds over the 356 day period, or an average of 0.20 pounds per day. At 3,500 calories per pound, that equates to about 700 calories per day, roughly 100 more than the 595-calorie deficit I recorded. So, it looks like I lost more weight than I should have given the reported calorie data.
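The comparison above is simple enough to check by hand. This sketch redoes the arithmetic with the rounded figures quoted in the text; the variable names are just for illustration.

```python
CAL_PER_LB = 3500        # rule-of-thumb calories per pound of body weight
days = 356
pounds_lost = 71
recorded_deficit = 595   # mean daily deficit from the summary table (cal/day)

# Daily deficit that would be needed to explain the actual loss
implied_deficit = pounds_lost * CAL_PER_LB / days
# Weight loss that the recorded deficit would predict over the period
predicted_loss = recorded_deficit * days / CAL_PER_LB

print(round(implied_deficit))                     # 698
print(round(predicted_loss, 1))                   # 60.5
print(round(implied_deficit - recorded_deficit))  # 103
```

In other words, the recorded calories account for about 60 of the 71 pounds, leaving a gap of roughly 100 calories per day to explain.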

The next post will dig into possible sources of error and explore whether the data themselves can help us understand where I went wrong.
