Critical analysis of McSwiney et al’s 2017 keto study

Authors
Adam Tzur (FBSCI-FIT)
Anthony Roberts (FB, Medium)

Acknowledgements
Conrad Earnest (FB)

Length: ~3500 words
Published: 06.11.2017 (updated: 06.11.2017)


Plain language summary

A ketogenic diet study has recently been published by McSwiney et al., 2017. The study looks at how keto compares to a high carb diet for endurance performance, body composition, and more.

The study seems promising, and it has some strengths. For example, it is a long-term study that also measures ketones in endurance athletes. It also shows that the ketogenic diet leads to weight loss, even when participants are not asked to restrict calories.

However, on thorough analysis, several issues present themselves:

  • Study groups are not randomized. Instead, the subjects selected their own groups.
  • The authors report that endurance performance improved, but the improvements are calculated relative to body weight. You can maintain your performance during a study. Yet, if you lose weight, your performance per kilogram of body weight improves. This type of data reporting is misleading.
  • The authors miscite other studies.
  • There were baseline differences in body weight and caloric intake (keto weighed about 10 kg more and ate about 400 kcal more than the high carb group). This is due to a lack of randomization, as mentioned above.
  • The title of the study does not reflect the data (I recalculated the data to show absolute values rather than relative values).
  • There is much more. Go down to the limitations section to read more.

Due to the issues in the study, the following title is more representative of the findings:

  • “Weight loss leads to relative, but not absolute improvements in 2/6 performance-related variables in endurance athletes on a ketogenic diet, in a nonrandomized trial”

Want to learn more about the ketogenic diet for fat loss, muscle building, and performance? Read our review

Study facts

Study title

Keto-adaptation enhances exercise performance and body composition responses to training in endurance athletes (McSwiney et al., 2017)

Methodology of the study

Goal

The researchers compared a high carb diet to a low carb ketogenic diet. They wanted to test which diet is superior for long-term athletic performance.

Duration

The study lasted 12 weeks. We can consider it a long-term keto performance study, relative to the rest of the literature.

Choice of groups

The authors opted for a non-randomized trial, allowing subjects to choose whether they wanted to be in the keto or high carb group. They did this to promote maximal adherence to the diet.

Subjects

There were originally 47 male participants, but by the end of the study only 20 were left. The participants were endurance athletes who trained >7 hours per week with >2 years training experience. They also competed in endurance events.

At the start of the study, the participants had a mean age of 33 years, a mean body weight of 80 kg, and a mean BMI of 24.7. All subjects habitually got >50% of their daily calories from carbs.

Measurements and tests

Before subjects started their respective diets, they did a number of baseline tests:

  • Body weight was measured using a weight scale
  • Body composition was measured using DXA
  • Fasting blood samples were collected


After these measurements, the subjects prepared for an exercise test:

“(...) participants were allowed 2 hours to “fuel up” prior to the exercise trial (...) Each group was allowed to self-select their pre exercise carbohydrate based meal, to ensure habitual dietary practices and performance measurements were obtained.”

The first performance test was a six-second sprint. Subjects rode a Wattbike to test peak and average power output. Note that “a relative load of 0.5 of air resistance was applied for every 5kgs of body weight”. Hence, lighter subjects used less resistance than heavier subjects.

For the second test, subjects rode their bike for a 100 km time trial. The researchers measured VO2, VCO2, and lactate during this trial.

Right after the time trial, a three-minute critical power test was performed. They tested peak VO2, peak power, and average power in watts. Subjects were asked to maximize their power output.

Diet

High carb group macronutrient goals:

  • 65% carbohydrates
  • 20% fat
  • 14% protein

“[High carb] participants were instructed to consume carbohydrates based on their daily energy requirements”

Keto group macronutrient goals:

  • >75% fat
  • 10-15% protein
  • <50 grams of carbs

The keto group was asked to eat as much fat as they wanted while following the protein and carb guidelines. The keto group also consumed extra electrolytes due to the dehydrating effects of keto.

Training

“Each group received the same training intervention, with endurance training (cycling and running), strength training and high intensity interval training (HIIT)”

“Each participant completed 7+ hours a week endurance training (moderate intensity 56 – 68% VO2max), 2 strength sessions; 6 sets of 8-10 reps on a leg press, or free squat (70 – 80% of participants 1RM), and 2 HIIT sessions/week (10 sets of 1 minute bouts at 70% peak power with 1 minute recovery).”

Statistical analyses

  • Data are presented as mean±SD.
  • Data tested for normality
  • Effect sizes shown as partial eta-squared (ηp2).
  • Confidence intervals not shown.
  • ANCOVA for between-group differences

Ketones

“Fasting Beta-hydroxybutyrate (βHB) concentrations were determined at baseline and post-testing”

Results

After the 12 weeks, the baseline measurements and tests were done again. However, this time, the keto group did not consume a carbohydrate rich meal before the exercise tests. The high carb group did consume plenty of carbs before and during the tests.

Diet

The blue cross indicates a statistically significant between-group difference

Body composition

The blue cross indicates a statistically significant between-group difference

Athletic performance (VO2max and time trial)

The athletes had an absolute VO2max of 4-4.6 l/min, which can be considered good according to data from the University of Washington. The university suggests that world-class athletes have a VO2max of 5.3 l/min while training college students are at 3.9 l/min.

Power performance

Absolute values were calculated by multiplying w/kg with body weight at the time of measurement
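The conversion described above is a single multiplication. Here is a minimal sketch of it; the function name and the example athlete are made up for illustration and are not taken from the study:

```python
def absolute_power_watts(relative_w_per_kg: float, body_mass_kg: float) -> float:
    """Convert power relative to body mass (W/kg) into absolute power (W)."""
    return relative_w_per_kg * body_mass_kg


# Hypothetical athlete, NOT study data: 4.0 W/kg at a body mass of 80 kg.
example_watts = absolute_power_watts(4.0, 80.0)
print(example_watts)  # 320.0
```

The body weight has to be the one measured at the same time point as the w/kg value, otherwise the weight change leaks into the result.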

Adherence to diet and exercise

The keto group exercised more than the high carb group, though the authors report no statistically significant differences:

“There was no significant difference in HC and LCKD group’s number of HITT (HC: 18.2 ± 2.0, versus LCKD: 19.7 ± 2.3), strength sessions (HC: 17.8 ± 2.1, versus LCKD: 18.3 ± 3.9) or hours endurance training (HC: 11.1 ± 1.7, versus LCKD: 13.0 ± 2.8) completed per week.”

According to the self-reported food intakes and ketone measurements, the keto group adhered to their diet. Daily carb intake was 41.1 grams and BHB levels were 0.5 mmol/L. Ketosis is reached at this level.

Five keto subjects dropped out because the diet was too difficult to adhere to.

Drop-outs

“The reasons for dropout were: an injury or illness not related to the intervention (HC n = 7; LCKD n = 9), intervention too time consuming (HC n = 1; LCKD n = 1), dietary intervention too difficult to adhere to (LCKD n = 5), participants unable to complete post-intervention testing (LCKD n = 2), strength and HIIT training too difficult to incorporate into training week (HC n = 1), and technical difficulty at post-intervention testing (LCKD n = 1).”
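As a quick sanity check, the dropout reasons quoted above can be tallied per group and compared with the enrollment numbers reported earlier (47 enrolled, 20 finished). The short labels in this sketch are my own abbreviations of the quoted reasons:

```python
# Dropout counts per group, copied from the quote above (labels abbreviated).
dropouts = {
    "HC": {
        "injury or illness": 7,
        "too time consuming": 1,
        "training too difficult": 1,
    },
    "LCKD": {
        "injury or illness": 9,
        "too time consuming": 1,
        "diet too difficult": 5,
        "could not complete post-testing": 2,
        "technical difficulty": 1,
    },
}

totals = {group: sum(reasons.values()) for group, reasons in dropouts.items()}
enrolled, finished = 47, 20

print(totals)  # {'HC': 9, 'LCKD': 18}
print(sum(totals.values()) == enrolled - finished)  # True
```

The numbers add up, and they show that the keto group accounted for twice as many dropouts as the high carb group.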

Strengths of the study

The primary strength of this study is its duration. 12 weeks is impressive compared to the rest of the literature. Furthermore, they recruited 20 endurance athletes to partake in the study, which is also solid. The researchers measured BHB levels which is good because many keto studies do not do this, leaving us to rely on self-reported carb intake. In this study, both carbs eaten and BHB levels indicate that the participants were in ketosis. VO2, different indices of power, and time trial performance were all measured.

The researchers did not ask participants to reintroduce carbs during the trial or before the last performance tests, which other keto studies have done previously.

The study shows, alongside other studies, that an ad-libitum ketogenic diet leads to fat loss, most likely because participants naturally eat less without being asked to.

Problems and limitations in the study

Absolute vs. relative changes

Perhaps the biggest issue in this study is data reporting. Almost every performance outcome is reported as a relative value (performance per kilo of body weight). If you want to see the absolute values, check out the tables above.

Why is it an issue when only relative values are reported in a study? In this case, the keto group lost body weight. Weight loss leads to an automatic increase in relative performance, because the performance is standardized relative to body weight. This could mislead us into thinking that absolute improvements occurred when they did not. Indeed, we see this in the performance tables provided above.

An example:

The keto group increased their 6 second sprint average power by 0.5 watt/kg (relative), yet their absolute average power decreased by 32 watts. High carb increased by 0.3 w/kg and 13 watts.

This does not imply that relative performance values are useless, but researchers should always report both relative and absolute values. Only reporting relative values can be misleading.
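To make the mechanism concrete, here is a small sketch that holds absolute power constant and only changes body weight. The body weight figures are the keto group means reported in this article (86.3 kg at baseline, 5.9 kg lost). The 300 watt value is a made-up number chosen only for illustration:

```python
BASELINE_MASS_KG = 86.3   # keto group mean at baseline (reported in this article)
MASS_LOSS_KG = 5.9        # keto group mean weight loss (reported in this article)
ABSOLUTE_POWER_W = 300.0  # hypothetical, deliberately kept constant


def relative_power(absolute_w: float, body_mass_kg: float) -> float:
    """Power per kilogram of body weight (W/kg)."""
    return absolute_w / body_mass_kg


before = relative_power(ABSOLUTE_POWER_W, BASELINE_MASS_KG)
after = relative_power(ABSOLUTE_POWER_W, BASELINE_MASS_KG - MASS_LOSS_KG)
pct_change = (after / before - 1) * 100

print(f"{before:.2f} W/kg -> {after:.2f} W/kg ({pct_change:+.1f}%)")
```

With zero change in absolute power, relative power still rises by roughly 7%. The size of that "improvement" depends only on the ratio of the two body weights, not on the wattage picked for the example.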

Self-selecting groups and the importance of randomization

The subjects were allowed to select their own diet group in this study. This was done to promote adherence to the diet.

This may seem like a good reason, because only 20 out of 47 athletes finished the study, meaning that over half of the participants dropped out from both of the diet groups. How much worse would adherence be if subjects could not select their own diet?

While this may seem like solid logic, it has some issues:

One: If only the most dedicated and motivated athletes can stick to a diet, then it is not a good dietary choice for athletes in general.

Two: Randomization is very important because it prevents bias by equalizing groups at baseline. In this study, there are several variables that are not similar at baseline. And those are only the variables that were measured. Think about all the UNmeasured confounding variables that can be eliminated with randomization! This affects the outcomes of the study because the comparison is not fair; one does not compare an elephant to a mouse.

“One critical component of clinical trials is random assignment of participants into groups. Randomizing participants helps remove the effect of extraneous variables (...) and minimizes bias associated with treatment assignment. Randomization is considered by most researchers to be the optimal approach for participant assignment in clinical trials because it strengthens the results and data interpretation.” - Kang et al., 2008

“Randomized clinical trials are the gold standard for evaluating the efficacy of therapies or comparing one therapy with another. When applied in adequately powered trials, randomization eliminates selection and other forms of bias, generates groups under study that are alike in all important aspects (except for the intervention received), and avoids confounding by measured and unmeasured confounding variables. By design, randomized clinical trials have high internal validity, with the ability to determine cause-effect relationships.” - Fonarow, 2016
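The difference between random assignment and self-selection can be illustrated with a toy simulation. Everything in it is an assumption: body weights are drawn from a normal distribution (mean 80 kg, SD 10 kg), and the self-selection rule (heavier athletes pick keto 80% of the time, lighter athletes 20% of the time) is invented purely to show the mechanism. The study does not report why subjects chose their group.

```python
import random
import statistics


def mean_gap(group_a, group_b):
    """Absolute difference between two group means."""
    return abs(statistics.mean(group_a) - statistics.mean(group_b))


def simulate(n_trials=2000, n_subjects=40, seed=0):
    rng = random.Random(seed)
    random_gaps, self_selected_gaps = [], []
    for _ in range(n_trials):
        # Hypothetical body weights in kg.
        weights = [rng.gauss(80, 10) for _ in range(n_subjects)]

        # Random assignment: shuffle, then split in half.
        shuffled = weights[:]
        rng.shuffle(shuffled)
        half = n_subjects // 2
        random_gaps.append(mean_gap(shuffled[:half], shuffled[half:]))

        # Assumed self-selection rule: heavier athletes lean toward keto.
        keto, carb = [], []
        for w in weights:
            p_keto = 0.8 if w > 80 else 0.2
            (keto if rng.random() < p_keto else carb).append(w)
        if keto and carb:  # skip the rare trial where one group is empty
            self_selected_gaps.append(mean_gap(keto, carb))

    return statistics.mean(random_gaps), statistics.mean(self_selected_gaps)


random_gap, self_selected_gap = simulate()
print(f"average baseline gap, randomized:    {random_gap:.1f} kg")
print(f"average baseline gap, self-selected: {self_selected_gap:.1f} kg")
```

Under random assignment the groups still differ a little by chance, but the gap is much smaller than when the choice of group is tied to a subject characteristic. The same logic applies to characteristics nobody measured.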

Losing body weight while increasing calories?

As is typical in ketogenic studies, the keto group lost body weight even though they weren’t asked to restrict calories. However, it is unclear how this happened, because both groups ate about 200 kcal more than on their habitual diet. High carb only lost a little body weight (-0.8 kg), while keto lost much more (-5.9 kg). In absolute terms, keto ate about 400 kcal more than the high carb group.

Either the groups misreported their caloric intake, or something else is going on. From the study:

“Calorie intake in each group slightly increased during the trial, which contradicts how weight loss occurred. Weight loss is attributed to increased energy output, due to added training. Added weight loss within LCKD participants could be due to slightly greater volume of training undertaken each week.”

Indeed, perhaps the weight loss was due to greater training volumes. But if we assume this to be true, we must also accept that keto exercised more than high carb. Greater training volume could then act as a confounder (assuming that more training = better performance).

Furthermore, the weight loss affects the relative performance increases (VO2max/power per kilo of body weight). This is discussed in depth under “Absolute vs. relative changes” above.

Misciting the literature: protein inhibits ketosis?

In the discussion, the authors claim that “studies (...) with higher protein intakes fails to adequately induce nutritional ketosis, and may hinder adaptation”. To support this statement, they cite Zinn et al. and Paoli et al. Yet, according to the Zinn study, “The athletes varied little from the sample diets provided to them throughout the 10 weeks, with adherence to the diets verified by blood ketones always staying above 0.5 mmol/l from week 2 onwards.”

In Paoli et al., 2012, no ketone measurements were done. This raises the question of how that study could show that protein prevents ketosis.

We have previously scoured the literature for how protein affects ketosis. The data we found do not suggest that high protein intakes prevent ketosis. Read more in our keto review.

More protein for the keto group

“Protein intake was significantly greater post intervention in [keto] group compared to the [high carb] group (P = 0.010 ).”

The keto group ate ~40 grams more protein than the high carb group. Protein is a macronutrient that affects satiety, energy expenditure, and lean mass.

This difference is not only statistically significant, but also clinically significant / practically important. The protein acts as a confounder for the outcomes, because the high protein group will more likely be satiated (hence eat less), expend slightly more energy, and gain more lean mass.

Baseline differences in body weight and caloric intake

According to the authors, “Energy intake remained unchanged in each group”, yet they reached this conclusion via a statistical significance test, which is an inferential statistic. Both groups increased their energy intake by about 200 kcal (descriptive statistic). High carb ate about 400 kcal less than keto (see the diet table above for details).

The keto group weighed, on average, 86.3 kg at baseline while high carb weighed 76.5 kg. This is a difference of roughly 10 kg in body weight. These inequalities occur because of a lack of randomization.

Statistical significance tests at baseline

Many studies use statistical significance tests to see if groups were different at baseline. The reasoning seems to go like this:

  1. We want to test for baseline differences because big differences could affect outcomes
  2. if P>0.05 then groups were not different at baseline
  3. if P<0.05 then groups were different at baseline
  4. rinse and repeat for every variable of interest

Problems with this approach:

  • A non-significant difference at baseline can still affect the outcome. Data reported at baseline are descriptive statistics. A 200 kcal energy deficit or surplus will affect the subjects regardless of whether the change in energy intake was significant or not. The same applies to body weight, strength, performance, etc. Even variables that are typically not measured, like motivation, can affect the outcome.
  • What does the baseline significance test (an inferential statistic) actually tell us?

“non-significant imbalances can exert a strong influence on the observed result of the trial” - Altman, 1985

“The arguments most often used to substantiate the choice for statistical testing for baseline differences are that one needs to examine whether randomization was successful and that one needs to assess whether observed differences in baseline characteristics are ‘real’ or ‘important’. To test whether randomization was successful is quite problematic and, more importantly, not necessary” - de Boer et al., 2014

“there have been numerous papers on the subject (...) in which significance testing of baseline differences has been discouraged or condemned [3-7].” - de Boer et al., 2014

“Although proper random assignment prevents selection bias, it does not guarantee that the groups are equivalent at baseline. Any differences in baseline characteristics are, however, the result of chance rather than bias [32]. The study groups should be compared at baseline for important demographic and clinical characteristics so that readers can assess how similar they were. Baseline data are especially valuable for outcomes that can also be measured at the start of the trial (such as blood pressure)” - Moher et al., 2010 (CONSORT guidelines)

“Unfortunately significance tests of baseline differences are still common [[23], [32], [210]]; they were reported in half of 50 RCTs trials published in leading general journals in 1997 [183]. Such significance tests assess the probability that observed baseline differences could have occurred by chance; however, we already know that any differences are caused by chance. Tests of baseline differences are not necessarily wrong, just illogical [211]. Such hypothesis testing is superfluous and can mislead investigators and their readers. Rather, comparisons at baseline should be based on consideration of the prognostic strength of the variables measured and the size of any chance imbalances that have occurred [211].” - Moher et al., 2010 (CONSORT guidelines)

“With few exceptions, the statistical literature is uniform in its agreement on the inappropriateness of using hypothesis testing to compare the distribution of baseline covariates between treated and untreated subjects in RCTs [1,3–6]. Senn writes that, in an RCT, ‘over all the randomizations the groups are balanced; and that for a particular randomization they are unbalanced’ [1]. Thus, in an RCT, the only reason to use a significance test would be to examine the process of randomization itself. As Begg suggests, ‘a significance test of the association between the covariate and the treatment assignment is a test of the hypothesis that the treatments are randomly distributed. In other words, it is a test of a null hypothesis that is known to be true’ [3]. Similarly, Altman writes that ‘performing a significance test to compare baseline variables is to assess the probability of something having occurred by chance when we know that it did occur by chance. Such a procedure is clearly absurd’ [4]. Although randomization will, on average, balance covariates between treated and untreated subjects, it need not do so in any particular randomization.” - Austin et al., 2009
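To illustrate why a non-significant baseline test says little about whether an imbalance matters, here is a sketch with two small made-up groups whose mean body weights differ by 10 kg. The body weights are hypothetical, not study data, and the exact permutation test is my own choice of significance test for the example, not the one used in the paper:

```python
from itertools import combinations

# Hypothetical body weights in kg, NOT taken from the study.
keto = [70, 80, 86, 92, 102]      # mean 86
high_carb = [62, 70, 76, 82, 90]  # mean 76


def exact_permutation_p(a, b):
    """Two-sided exact permutation test on the difference in means.

    Assumes equal group sizes, so |mean(a) - mean(b)| is proportional
    to |2 * sum(a) - total| and integer sums can be compared directly.
    """
    pooled = a + b
    total = sum(pooled)
    observed = abs(2 * sum(a) - total)
    splits = list(combinations(range(len(pooled)), len(a)))
    extreme = sum(
        1
        for idx in splits
        if abs(2 * sum(pooled[i] for i in idx) - total) >= observed
    )
    return extreme / len(splits)


gap = sum(keto) / len(keto) - sum(high_carb) / len(high_carb)
p = exact_permutation_p(keto, high_carb)
print(f"baseline gap: {gap:.1f} kg, two-sided p = {p:.3f}")
```

With groups this small, the p-value comes out above 0.05, so the groups would be labelled "not different at baseline". The 10 kg gap is still there, and it still feeds into every outcome that is expressed per kilogram of body weight.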

The performance tests were likely easier for the keto group

The keto group lost body weight, and as such, their bike resistance was lowered at the end of the trial:

“a relative load of 0.5 of air resistance was applied for every 5kgs of body weight”.

This likely affected their time trial performance and possibly the other outcomes.

Since the high carb group did not lose much weight, their load was similar at week 1 and 12.
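A rough sketch of how the load would change, under two assumptions that the paper's wording does not confirm: the load scales continuously with body weight (rather than in 5 kg steps), and it was set again from the week 12 body weight. The function name is hypothetical and is not part of any Wattbike software. Body weights are the keto group means reported in this article.

```python
def air_resistance_load(body_mass_kg: float,
                        load_per_step: float = 0.5,
                        kg_per_step: float = 5.0) -> float:
    """Linear reading of '0.5 of air resistance for every 5 kg of body weight'."""
    return load_per_step * body_mass_kg / kg_per_step


baseline_load = air_resistance_load(86.3)      # keto mean body weight at baseline
post_load = air_resistance_load(86.3 - 5.9)    # after the reported 5.9 kg loss

print(f"baseline load: {baseline_load:.2f}, post load: {post_load:.2f}")
```

Under these assumptions the keto group's setting drops from about 8.6 to about 8.0, while the high carb group's setting barely moves.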

The title does not reflect the data

Due to the issues mentioned in this critical analysis, the title does not accurately reflect the findings: “Keto-adaptation enhances exercise performance and body composition responses to training in endurance athletes”

Suggested title: “Weight loss leads to relative, but not absolute improvements in 2/6 performance-related variables in ketogenic endurance athletes in a nonrandomized trial”

Conflicts of interest

Jeff S. Volek (Ph.D., R.D)

  • Sells: carb drinks (Superstarch): “SuperStarch is a complex carbohydrate (derived from non-GMO corn) that uniquely stabilizes blood sugar and causes virtually no reaction from the fat-storage hormone insulin. It's backed by proven science. Finally there's a healthier, more efficient energy source than sugars, caffeine, or high-carb meals.”

Volek founded "KetoThrive" (aka KetoThrive Consulting). The trademark was filed on May 14th, 2014 for use with a medical consulting firm. The diet, keto, and type 2 diabetes papers he has authored since that date should have listed his founding and ownership of a keto consulting firm as a financial conflict. His most relevant degrees are an R.D. and a B.S. in dietetics. His advanced degrees are both related to exercise and kinesiology.

KetoThrive eventually became VirtaHealth, for which Dr. Volek is listed as both a founder and CSO. VirtaHealth is “an online specialty medical clinic that reverses type 2 diabetes without medications or surgery". The clinic has funded some of Volek’s research, and in this 2017 study, the COIs of his co-authors and himself were listed.

Dr. Volek has authored five books (with the earliest LC/Keto-specific title showing a publication date of March 2010); at least three of his books specifically deal with low carb or ketogenic dieting, one involves carbohydrate timing, and all at least deal with his theories on carbs. Volek has promoted some of his books in his scientific papers.

Volek's lab at Ohio State University lists him as having procured $7MM in research grants. He is or has been a consultant for Atkins Nutritionals, Metagenics, and UCAN (hence the Superstarch patent below). He has also received grants from the National Dairy Council and the Malaysian Palm Oil Board. He has had studies funded by, or been an author on studies funded by, the Dairy Research Institute, the Dr. Robert C. Atkins Foundation (since at least 2004), the California Grape Commission, and the American Egg Board.

Patents:

  • Physiogenomic method for predicting response to diet (US20070196841A1)
  • Heat moisture treated carbohydrates and uses thereof (Application US20110287131A1)

Other points of discussion

The researchers chose to use the Wattbike and the critical power test. Some have pointed out that this is a non-standard test and that it might be better to use a more standard test for power, such as Wingate. However, the Wattbike seems to be in common use (Kell and Greer, 2017) and some suggest it is a reliable and valid tool (Driller et al., 2013; Driller et al., 2014; Wainwright et al., 2017). There might be reliability issues at low power settings (Hopker et al., 2010).

"Critical power (CP) represents the highest work rate for which a metabolic steady state is attainable" (Goulding et al., 2017). Critical power is used by multiple research teams (Silveira et al., 2017; Vinetti et al., 2017; Griffin et al., 2017; Goulding et al., 2017; Karsten et al., 2017). Regarding validity, some research suggests that the 3-min all-out linear cycling test does not provide a good estimate of critical power, yet the isokinetic test may be more valid (Wright et al., 2017), though some disagree regarding the isokinetic mode (Karsten et al., 2014). Some question how accurately measurements of critical power reflect the maximal steady state (Maturana et al., 2016). Others find that the 3-minute test can be reliable and valid (Vanhatalo et al., 2007; Bergstrom et al., 2012; Vanhatalo et al., 2008; Karsten et al., 2015; Jones and Vanhatalo, 2017) given proper familiarization (Simpson et al., 2017). It should be noted that critical power and anaerobic work capacity are influenced by lean body mass and total body mass (Byrd et al., 2017). This highlights the importance of matching body composition and controlling weight loss during trials. How long subjects rest between CP tests may also influence the outcomes (Karsten et al., 2017).

"The curvilinear relationship between power output and the time for which it can be sustained is a fundamental and well-known feature of high-intensity exercise performance. This relationship 'levels off' at a 'critical power' (CP) that separates power outputs that can be sustained with stable values of, for example, muscle phosphocreatine, blood lactate, and pulmonary oxygen uptake (V̇O2), from power outputs where these variables change continuously with time until their respective minimum and maximum values are reached and exercise intolerance occurs. The amount of work that can be done during exercise above CP (the so-called W') is constant but may be utilized at different rates depending on the proximity of the exercise power output to CP." - Jones et al., 2017

"The overestimation of ramp incremental performance suggests that the CP and W' derived from different work-rate forcing functions, thus resulting in different VO2 kinetics, cannot be used interchangeably. The present findings highlight a potential source of error in performance prediction that is of importance to both researchers and applied practitioners." - Black et al., 2016

Beyond this, several of the performance outcomes were unchanged when comparing baseline values to post values. One could question how you can train athletes for 12 weeks and see little change in performance. Thanks to Chad Macias for pointing this out. Perhaps it was the added training load (they added resistance training to the intervention)?