Michael Wimsatt's GitHub blog
In the last post, I dug into a year’s worth of weight data, describing the patterns and trends I saw and offering some explanations for each. Here we’re going to compare those results against the theoretical weight loss predicted by well-established calorie-to-fat (and muscle) formulas, to see just how predictive my recorded food and exercise data really were.
##Analyzing calories in and calories out
In addition to weighing myself daily, I recorded what I ate and all my exercise in LoseIt!, an iPhone app. I’ll attempt to determine how much of my weight loss was directly attributable to my calorie deficit.
###Calculating calorie deficit
First of all, how does one calculate a calorie deficit? Generally, it looks like this:
cal_deficit = base_cal_burn + exercise_cal_burn - cal_consumption
where base_cal_burn is based on your Basal Metabolic Rate (BMR). BMR declines as you lose weight, and multiplying it by a factor for your “base” activity level yields your total daily energy expenditure (TDEE).
base_cal_burn = tdee(w1y,height=72,age=36,sex='m',multiplier=1.2)
ax = base_cal_burn.plot()
ax.set_ylabel('TDEE - Sedentary (cals)');
As you can see, my base calorie burn declined significantly as I lost weight. This is one of the insidious challenges in weight loss: after a significant change, you must maintain a much lower calorie intake (or offset it with exercise) to stay at your goal weight.
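The tdee helper called above is not defined in this post. Here is a minimal sketch of what such a function could look like, assuming the commonly quoted rounded Harris-Benedict equation, weight in pounds, and height in inches. The conversion constants and the function body are assumptions; the original implementation may differ. With these choices the sketch does land on the minimum and maximum Base Calories values shown in the summary table further down.

```python
import pandas as pd

LB_TO_KG = 0.453592  # assumed conversion factor
IN_TO_CM = 2.54


def tdee(weight_lb, height, age, sex="m", multiplier=1.2):
    """Hypothetical TDEE helper: Harris-Benedict BMR times an activity multiplier.

    weight_lb can be a scalar or a pandas Series of daily weights in pounds;
    height is in inches, age in years. multiplier=1.2 is the usual
    "sedentary" activity factor.
    """
    kg = weight_lb * LB_TO_KG
    cm = height * IN_TO_CM
    if sex == "m":
        bmr = 66 + 13.7 * kg + 5 * cm - 6.8 * age
    else:
        bmr = 655 + 9.6 * kg + 1.8 * cm - 4.7 * age
    return bmr * multiplier


# The heaviest and lightest weights from the summary table.
weights = pd.Series([276.2, 204.2])
print(tdee(weights, height=72, age=36, sex="m", multiplier=1.2))
```

Because the formula is linear in weight, every pound lost lowers the sedentary estimate by the same amount (roughly 7.5 calories per day under these assumptions).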
exercise_cal_burn = Series(
    pd.read_csv('data/ExerciseCalories4392.csv', index_col='Date',
                parse_dates=True).sort_index()['Exercise Calories']['2009-06-26':'2010-06-22'])
ax = exercise_cal_burn.plot(style='b.')
ax.set_ylabel('Exercise Calories (cal)');
As you can see, I exercised regularly throughout my weight loss, with the total energy expenditure increasing significantly in 2010 as I started running longer distances.
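The loading step above (read the CSV, sort by date, pick one column, slice a date range) can be wrapped in a small helper. This is only a sketch: the function name load_daily_series and the inline CSV rows are made up for illustration and are not the LoseIt! export itself.

```python
import io

import pandas as pd


def load_daily_series(source, column, start, end):
    """Read a date-indexed CSV and return one column for start..end inclusive."""
    df = pd.read_csv(source, index_col="Date", parse_dates=True).sort_index()
    return df[column].loc[start:end]


# Made-up rows in deliberately shuffled order (not the author's data).
csv_text = """Date,Exercise Calories
2009-06-27,310.5
2009-06-25,0
2009-06-26,492.0
2009-06-28,0
"""

exercise = load_daily_series(
    io.StringIO(csv_text), "Exercise Calories", "2009-06-26", "2009-06-27"
)
print(exercise)
```

Sorting before slicing matters: date-string slices on a DatetimeIndex are only reliable when the index is in order.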
cal_consumption = Series(
    pd.read_csv('data/FoodCalories4392.csv', index_col='Date',
                parse_dates=True).sort_index()['Food Calories']['2009-06-26':'2010-06-22'])
ax = cal_consumption.plot(style='b.')
ax.set_ylabel('Calories Consumed (cal)');
It looks like my calorie consumption range was pretty tight earlier on and varied much more widely toward the end of the weight loss period. Also, you’ll note some very low and “zero” calorie consumption days. As an example, on August 15, 2009 I apparently consumed only 285 calories. Here’s what it looks like:
So, apparently I ate a granola bar (typical pre-run snack), went for a three-mile run, ate a banana (typical post-run snack), then ate nothing else the rest of the day. I suppose this is theoretically possible. I’m more inclined to think I got lazy and stopped recording calories.
So, I looked through my calendar and emails from that week. It turns out I was at a family reunion in upstate New York that weekend. I assure you I ate - and drank - plenty that day and the rest of the weekend. So, this is one of a few data points that I need to clean up. In fact, I think I’ll ignore all days that are below a certain threshold on calorie consumption, and assume I had a net calorie deficit of zero. I suspect this is an optimistic assumption, but more on that later.
cal_consumption.hist(color='k', alpha=0.5, bins=20);
I can believe I had days of calorie consumption in the low 1000s, but anything under 1000 calories is suspect, so I’ll mark those (including the zero days) as missing (NaN) values.
cal_consumption = cal_consumption.where(cal_consumption >= 1000)
cal_consumption.hist(color='k', alpha=0.5, bins=20);
That filtered out about 30 points. Those low ones still look a bit suspect, but let’s run with it like this. Those high ones look like outliers, too, but I can’t imagine recording calories I didn’t actually consume, so I think those are real. I like KDE plots, so let’s look at it like that, too.
cal_consumption[cal_consumption.notnull()].plot(kind='kde');
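To make the thresholding step concrete, here is a small sketch on made-up numbers (not my actual log). Series.where keeps values where the condition holds and replaces the rest with NaN, so the suspect days stay in the index but drop out of counts and averages.

```python
import pandas as pd

# Made-up daily calorie totals, including two implausibly low days.
sample = pd.Series(
    [2100.0, 285.0, 0.0, 1850.0, 1028.35],
    index=pd.date_range("2009-08-14", periods=5, freq="D"),
)

# Keep values of at least 1000 calories; everything else becomes NaN.
cleaned = sample.where(sample >= 1000)

n_masked = int(cleaned.isna().sum())
print(cleaned)
print("masked days:", n_masked)
print("mean of remaining days:", cleaned.mean())
```

Note that mean() skips NaN by default, so the masked days are excluded from the average rather than counted as zero.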
OK, let’s combine it all in a data frame.
from pandas import Timestamp as ts
weightdf = DataFrame({'Weight': w1y, 'Base Calories': base_cal_burn,
                      'Exercise Calories': exercise_cal_burn,
                      'Food Calories': cal_consumption})
# Drop partial weeks
drop_dates = ('2009-6-26', '2009-6-27', '2009-6-28', '2010-6-21', '2010-6-22')
for d in drop_dates:
    weightdf = weightdf.drop(ts(d))
# Calculate calorie deficit
weightdf['Calorie Deficit'] = weightdf['Base Calories'] + weightdf['Exercise Calories'] - weightdf['Food Calories']
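# Average daily deficit implied by the observed weight change (3,500 calories per pound over 356 days)
print((weightdf['Weight'][ts('6-29-2009')] - weightdf['Weight'][ts('6-20-2010')]) * 3500 / 356)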
weightdf.describe()
698.033707865
| | Base Calories | Exercise Calories | Food Calories | Weight | Calorie Deficit |
|---|---|---|---|---|---|
| count | 357.000000 | 357.000000 | 319.000000 | 357.000000 | 319.000000 |
| mean | 2616.102449 | 228.925728 | 2269.690721 | 232.448739 | 595.182921 |
| std | 135.540163 | 308.787380 | 735.925260 | 18.176104 | 779.746546 |
| min | 2405.450116 | 0.000000 | 1028.350000 | 204.200000 | -2191.817612 |
| 25% | 2496.426157 | 0.000000 | 1723.000000 | 216.400000 | 82.876077 |
| 50% | 2608.281944 | 0.000000 | 2043.630000 | 231.400000 | 810.317858 |
| 75% | 2720.137731 | 492.029000 | 2717.205000 | 246.400000 | 1146.106112 |
| max | 2942.357895 | 1485.970000 | 5329.970000 | 276.200000 | 2277.650299 |
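The Food Calories and Calorie Deficit counts are 319 rather than 357 because a NaN in any input propagates through the arithmetic, and describe() only counts non-missing values. A toy sketch with invented numbers shows the effect, along with how the average would change if the missing days were treated as a zero deficit instead of being skipped:

```python
import pandas as pd

idx = pd.date_range("2009-06-29", periods=4, freq="D")

# Invented values; the second day has no usable food record.
df = pd.DataFrame(
    {
        "Base Calories": pd.Series([2900.0, 2898.0, 2896.0, 2894.0], index=idx),
        "Exercise Calories": pd.Series([0.0, 450.0, 0.0, 300.0], index=idx),
        "Food Calories": pd.Series([2200.0, float("nan"), 2500.0, 2000.0], index=idx),
    }
)

# NaN in Food Calories makes that day's deficit NaN as well.
df["Calorie Deficit"] = (
    df["Base Calories"] + df["Exercise Calories"] - df["Food Calories"]
)

counts = df.count()                                   # non-missing values per column
mean_skipping = df["Calorie Deficit"].mean()          # NaN days excluded
mean_zero = df["Calorie Deficit"].fillna(0).mean()    # NaN days counted as zero deficit

print(counts)
print(mean_skipping, mean_zero)
```

The two averages answer different questions: skipping the missing days assumes they looked like the recorded ones, while filling with zero assumes no deficit at all on those days.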
On average, I recorded a calorie deficit of about 595 calories per day (the mean of the Calorie Deficit column above).
Also, I lost 71 pounds over the 356-day period, or an average of 0.20 pounds per day. At 3,500 calories per pound, that equates to about 700 calories per day. So it looks like I lost more weight than the reported calorie data would predict.
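The comparison can be spelled out as plain arithmetic. The inputs below are the figures quoted above and in the summary table; the one extra assumption, flagged in the code, is that the mean deficit from the 319 days with food data also applies to the days without it.

```python
# Figures taken from the post and its summary table.
weight_lost_lb = 71.0
days = 356
cal_per_lb = 3500
mean_recorded_deficit = 595.182921  # mean of Calorie Deficit over 319 recorded days

# Daily deficit implied by the observed weight change.
implied_deficit = weight_lost_lb * cal_per_lb / days

# Shortfall between what the scale implies and what the log shows.
gap = implied_deficit - mean_recorded_deficit

# ASSUMPTION: the recorded mean holds for every one of the 356 days.
predicted_loss_lb = mean_recorded_deficit * days / cal_per_lb

print(round(implied_deficit, 2), round(gap, 2), round(predicted_loss_lb, 2))
```

Under that assumption the log accounts for roughly 60.5 of the 71 pounds, leaving a gap of about 100 calories per day to explain.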
The next post will dig into possible sources of error and explore whether the data themselves can help us understand where I went wrong.