How the AI thinks.
Six stages, connected end to end. Every number cited from USDA SR-Legacy and the ADA Exchange List. Trained for 22 months on tens of thousands of medically-referenced meals.
Six stages. One photograph.
Each stage is a discrete model with a published failure rate. None operates as a black box.
- 01 Segmentation: Isolate every dish on the plate as its own independent region.
- 02 Identification: Match each region against a medically-referenced food database.
- 03 Portion: Estimate weight in grams from on-plate scale cues.
- 04 Macros: Resolve carbohydrates, fat, and protein per 100 g, then scale to portion.
- 05 Glycemic load: Cross-reference to the Sydney University GI Database, sum across dishes.
- 06 Citation: Attach the exact source entry to every number, one tap away in the UI.
Find every dish on the plate.
Before any food can be weighed or cross-referenced, the model must answer a simpler but deceptively hard question: where does one dish end and the next begin? Segmentation is the stage that draws that boundary, and it runs before anything else in the pipeline.
The underlying architecture is a transformer fine-tuned on approximately 40,000 annotated plate images — meals photographed across lighting conditions ranging from restaurant candlelight to outdoor mid-day sun. Each image was hand-labelled by a team of annotators with training in food service and clinical nutrition. The model learned not just what a dish looks like in isolation, but how dishes relate to each other spatially: the way a pool of curry intrudes on a rice border, the way a garnish belongs to the dish beneath it, the way the sauce of one preparation bleeds into the negative space occupied by another.
Failure modes here are real and acknowledged. The first and most common is overlapping foods — a pile of salad draped over the edge of a protein portion, or a spoonful of condiment shared between two dishes. When the model cannot cleanly assign a pixel region to a single dish, it returns a lower confidence score rather than forcing a hard boundary. The second failure mode is glassware: a glass of juice in the frame reads to the segmentation layer as a translucent object of uncertain content, and is typically flagged with low confidence or excluded from the segmentation map entirely. The third is stews and mixed preparations — biryani, curry, congee — where the visual boundary between dish and broth does not exist in a geometric sense.
When confidence is low, CalEye does not fabricate a clean output. The uncertainty is surfaced in the UI: the dish region is shown with a dotted rather than solid boundary indicator, and the word "Estimated" appears next to any number derived from a low-confidence segment. The decision to make uncertainty visible was deliberate, and came directly from clinical feedback in our early testing phase. Diabetics especially need to know when a number is reliable and when it is a best-available approximation.
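The uncertainty handling described above can be sketched roughly as follows. This is a minimal illustration rather than CalEye's implementation: the `LOW_CONFIDENCE_THRESHOLD` value, the `SegmentDisplay` type, and the function name are all assumptions introduced here, since the page does not state the real cut-off.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the real threshold is not published on this page.
LOW_CONFIDENCE_THRESHOLD = 0.6


@dataclass(frozen=True)
class SegmentDisplay:
    boundary_style: str  # "solid" or "dotted"
    label: str           # "" or "Estimated"


def display_for_segment(confidence: float) -> SegmentDisplay:
    """Map a segmentation confidence in [0, 1] to the UI hints described above."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < LOW_CONFIDENCE_THRESHOLD:
        # Low confidence: dotted boundary and an "Estimated" tag on derived numbers.
        return SegmentDisplay(boundary_style="dotted", label="Estimated")
    return SegmentDisplay(boundary_style="solid", label="")
```

The point of the sketch is that a low score changes how the number is presented instead of being hidden or rounded away.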
Match each region to a medically-referenced food.
Once a region is segmented, identification maps it to a specific entry in a curated food database — not a generic label, but a specific, traceable food record with a glycemic index value, a macro profile per 100 grams, and a citation to a primary source. The database currently contains approximately 12,000 entries drawn from USDA SR-Legacy, the ADA Exchange List, the Indian Council of Medical Research nutrient atlas, and the NIN Hyderabad food composition tables.
The distinction between generic and specific identification matters more than it might seem. The model does not return "rice" — it returns one of white rice, brown rice, parboiled rice, basmati, jasmine, arborio, or any of nineteen other rice preparations with meaningfully different glycemic indices. White basmati, for instance, has a GI of approximately 58. Jasmine rice runs closer to 109. A system that collapses both into "rice" and applies a midpoint GI is not making a reasonable approximation — it is making a clinically meaningful error, especially for the patient who has been told by their endocrinologist to avoid high-GI grains.
Western-only food databases fail silently for global users. A database built from American and European meal data has no entry for idli, dhokla, poha, jalebi, bibimbap, gado-gado, or feijoada. When such a database encounters one of these preparations, it either returns nothing or returns the closest Western approximation — a failure that is invisible to the user and potentially significant in glycemic terms. CalEye's database includes a multilingual food set spanning Indian subcontinent cuisines, East Asian preparations, Mediterranean dishes, and Latin American staples. Coverage is not complete — no 12,000-entry database can be — but the failure mode for out-of-set foods is explicit rather than silent.
When the model cannot match a region with sufficient confidence, the UI displays a "Closest match" disclosure: the identified food, its confidence percentage, and a prompt to correct or confirm. This is preferable to the silent guess — which is how every lookup-based competitor currently handles the same situation.
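A rough sketch of the "Closest match" behaviour follows. It assumes the identification stage yields a score per candidate food; the `MATCH_THRESHOLD` value, the `MatchResult` type, and the disclosure wording are hypothetical and only illustrate the explicit-failure idea.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical threshold below which the match is disclosed, not asserted.
MATCH_THRESHOLD = 0.8


@dataclass(frozen=True)
class MatchResult:
    food_name: str
    confidence: float                 # 0.0 to 1.0
    disclosure: Optional[str] = None  # set when the user should confirm


def resolve_match(candidates: Mapping[str, float]) -> MatchResult:
    """Pick the highest-scoring candidate and decide whether to disclose it."""
    if not candidates:
        raise ValueError("no candidates to match against")
    name, score = max(candidates.items(), key=lambda item: item[1])
    if score >= MATCH_THRESHOLD:
        return MatchResult(name, score)
    note = f"Closest match: {name} ({score:.0%}). Correct or confirm?"
    return MatchResult(name, score, disclosure=note)
```

A confident match returns no disclosure text, while a weak one always carries the prompt, so a silent guess is not a possible output of this function.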
The hard problem: how much.
If segmentation is the hardest geometric problem in the pipeline and identification is the hardest knowledge problem, portion estimation is the hardest measurement problem. A two-dimensional photograph contains no depth information. The model is being asked to reason about a three-dimensional volume of food — its height above the plate, the density of its packing, the way it spreads versus mounds — from a flat projection. This is the stage that introduces the most variance in our output numbers, and we think it is important to say so directly.
The approach relies on scale cues embedded in the image. Every photograph of a meal contains objects whose real-world dimensions are known with reasonable confidence: the diameter of a standard dinner plate, the length of a fork, the height of a beverage glass, the dimensions of a standard takeaway container. The model has been trained to identify these objects and use them as metric anchors. When a plate is identified in the frame, its diameter — typically between 24 and 30 centimetres for a dinner plate — becomes the reference unit against which food dimensions are estimated.
Plate-edge detection is therefore not incidental to the portion stage — it is the foundational measurement step. The model identifies the plate boundary, estimates its diameter from the image perspective, and uses that diameter to scale all food dimensions detected within the frame. From food dimensions, it estimates volume using preparation-type density priors: the density of cooked white rice, for instance, is approximately 0.75 g/cm³ in a loosely-packed serving. These priors are calibrated from physical measurements taken during the training data collection phase.
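The chain from plate diameter to grams can be illustrated with a deliberately simplified sketch. It assumes the food is approximated as a flat-topped mound (footprint area times mean height) and that the mean height has already been estimated elsewhere; the function names and the example pixel counts are hypothetical, and only the 0.75 g/cm³ rice density comes from the text above.

```python
def cm_per_pixel(plate_diameter_cm: float, plate_diameter_px: float) -> float:
    """Derive the image scale from a plate of known (or assumed) diameter."""
    if plate_diameter_cm <= 0 or plate_diameter_px <= 0:
        raise ValueError("plate diameter must be positive")
    return plate_diameter_cm / plate_diameter_px


def estimate_portion_grams(
    footprint_px: float,        # area of the food region in pixels
    mean_height_cm: float,      # assumed to come from a separate height estimate
    scale_cm_per_px: float,
    density_g_per_cm3: float,
) -> float:
    """Convert a pixel footprint to grams via area, volume, and a density prior."""
    area_cm2 = footprint_px * scale_cm_per_px ** 2
    volume_cm3 = area_cm2 * mean_height_cm
    return volume_cm3 * density_g_per_cm3


# Illustrative numbers only: a 27 cm plate spanning 900 px gives 0.03 cm per pixel.
scale = cm_per_pixel(27.0, 900.0)
rice_grams = estimate_portion_grams(60_000, 3.0, scale, 0.75)
```

With these made-up inputs the rice region covers 54 cm², the volume is 162 cm³, and the estimate comes to about 121.5 g. Because the scale factor is squared, an error in the plate diameter is amplified in the area term, which is one way to see why the plate edge matters so much.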
The output is grams with a confidence interval — not a point estimate. The UI displays the central estimate, and tapping any macro number surfaces the full interval. A meal that resolves cleanly — a single portion of chicken breast beside a scoop of rice on a round white plate, photographed from above in good light — will show an interval of roughly ±8%. A mixed restaurant plate in low light will show ±20% or wider, and the interface labels it accordingly.
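As a small illustration of what the interval means in grams, the sketch below assumes a symmetric relative error around the central estimate. The symmetry and the function name are assumptions made here; the page does not say how the interval is actually constructed.

```python
from typing import Tuple


def portion_interval(central_grams: float, relative_error: float) -> Tuple[float, float]:
    """Return (low, high) in grams for a symmetric relative error such as 0.08."""
    if central_grams < 0 or relative_error < 0:
        raise ValueError("inputs must be non-negative")
    low = central_grams * (1.0 - relative_error)
    high = central_grams * (1.0 + relative_error)
    # A weight cannot be negative, even for very wide intervals.
    return max(low, 0.0), high
```

For a 150 g central estimate, a clean plate at ±8% spans roughly 138 g to 162 g, while a mixed plate at ±20% spans roughly 120 g to 180 g.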
One design decision that generated internal debate: when a stacked or layered food is present (a sandwich, a burger, a multi-layer casserole), the model tends to over-estimate slightly. We have deliberately not corrected this bias for our diabetic user base. For a person managing post-prandial blood glucose, an over-estimate of glycemic load that triggers a small conservative adjustment is less harmful than an under-estimate that lets a spike go unmanaged. The bias is disclosed in our accuracy documentation, and it is corrected out for users who have indicated a weight-management rather than glycemic-management context.
When no reliable scale cue is visible in the frame — close-up shots of bowls with no surrounding context — the model prompts the user to retake with more of the table in frame. We do not produce a portion estimate without a credible scale anchor.
Carbs, fat, protein — per 100 g.
Given the portion in grams from stage 3 and the food identity from stage 2, computing macros is in principle a straightforward multiplication. The per-100 g macro profile of the identified food — drawn from the USDA SR-Legacy or the equivalent regional source — is scaled by the estimated portion weight. The result is carbohydrates, fat, and protein in grams for the actual portion on the plate.
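The multiplication is simple enough to show directly. In this sketch the `Macros` type and function name are hypothetical, and the per-100 g figures for cooked white rice are rounded illustrative values, not a quoted database record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Macros:
    carbs_g: float
    fat_g: float
    protein_g: float


def scale_to_portion(per_100g: Macros, portion_g: float) -> Macros:
    """Scale a per-100 g macro profile to the estimated portion weight."""
    if portion_g < 0:
        raise ValueError("portion weight must be non-negative")
    factor = portion_g / 100.0
    return Macros(
        carbs_g=per_100g.carbs_g * factor,
        fat_g=per_100g.fat_g * factor,
        protein_g=per_100g.protein_g * factor,
    )


# Illustrative profile only (approximate values for cooked white rice).
rice_per_100g = Macros(carbs_g=28.0, fat_g=0.3, protein_g=2.7)
rice_portion = scale_to_portion(rice_per_100g, 150.0)
```

A 150 g portion scales every figure by 1.5, giving about 42 g of carbohydrate. Any error in the portion weight from stage 3 passes through to the macros in the same proportion.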
The reason we anchor on per-100 g values rather than per-serving values is worth explaining, because it is a design choice that runs counter to how nutrition labelling works in most jurisdictions. Serving sizes on packaged food can differ by a factor of four to eight across product categories: a "serving" of breakfast cereal is 30 g; a "serving" of pasta may be listed as 85 g dry or 190 g cooked. Neither figure is the portion that landed in your bowl. Per-100 g values are a stable, food-intrinsic measure that does not change with the serving size fiction printed on a label.
Calories are derived, not separately predicted. The 4-4-9 rule (4 kcal per gram of carbohydrate, 4 kcal per gram of protein, 9 kcal per gram of fat) is applied to the resolved macro figures to produce the calorie total. These are the general Atwater factors. The energy values in USDA SR-Legacy are likewise Atwater-derived, although many entries use food-specific factors rather than the rounded 4-4-9 values, so a derived total can differ slightly from the database's own energy figure. Running a separate calorie prediction model would introduce a second source of variance without improving accuracy.
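The derivation can be written in a few lines. The function name is hypothetical, and the sketch uses the general 4-4-9 factors exactly as stated, without any adjustment for fiber or alcohol.

```python
KCAL_PER_G_CARBOHYDRATE = 4.0
KCAL_PER_G_PROTEIN = 4.0
KCAL_PER_G_FAT = 9.0


def kcal_from_macros(carbs_g: float, protein_g: float, fat_g: float) -> float:
    """Apply the general 4-4-9 factors to already-resolved macro figures."""
    return (
        KCAL_PER_G_CARBOHYDRATE * carbs_g
        + KCAL_PER_G_PROTEIN * protein_g
        + KCAL_PER_G_FAT * fat_g
    )
```

For example, 42 g of carbohydrate, 4 g of protein, and 0.5 g of fat give 168 + 16 + 4.5 = 188.5 kcal. The calorie total inherits its uncertainty entirely from the macro figures, which is the reason given above for not predicting it separately.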
One quality check runs at this stage that can trigger a pipeline branch: if the protein-to-fat ratio for the identified food is implausible given the visual evidence — a portion the model has identified as chicken breast but where the protein-to-fat ratio resolves outside the range consistent with any preparation of chicken — the system initiates a partial re-segmentation pass focused on that region. This catches misidentification errors that the identification stage's confidence score alone might not surface.
Cross-reference to the glycemic-index table.
Glycemic load is computed from the carbohydrate figure produced in stage 4 and the glycemic index value associated with the identified food. The formula is: GL = (carbohydrates in grams × GI) / 100. This is the standard calculation used in the clinical literature; it builds on the glycemic index introduced by Jenkins et al. and was subsequently validated in meta-analyses published in the American Journal of Clinical Nutrition. For a plain-language explanation of why glycemic load is more useful than glycemic index alone, see Glycemic load vs glycemic index — the one that matters.
GI values are drawn from the University of Sydney's GI Database — the most comprehensive published source of peer-reviewed GI values. Where multiple GI values exist for a single food preparation (which is common — GI varies with cooking method, ripeness, and even cooling time for certain starches), the model applies a preparation-method prior: if the segmentation and identification stages have resolved the preparation type with sufficient confidence, the GI appropriate to that preparation is used. If not, a conservative upper-bound value is applied, consistent with the diabetic-user bias described in the portion stage notes.
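The selection rule can be sketched as below. The `PREP_CONFIDENCE_THRESHOLD` value, the function name, and the example GI figures are hypothetical placeholders chosen for illustration; they are not entries from the Sydney database.

```python
from typing import Mapping, Optional

# Hypothetical threshold for trusting the detected preparation method.
PREP_CONFIDENCE_THRESHOLD = 0.8


def select_gi(
    gi_by_preparation: Mapping[str, float],
    preparation: Optional[str],
    confidence: float,
) -> float:
    """Use the preparation-specific GI when confident, else the highest listed value."""
    if not gi_by_preparation:
        raise ValueError("no GI values available for this food")
    if preparation in gi_by_preparation and confidence >= PREP_CONFIDENCE_THRESHOLD:
        return gi_by_preparation[preparation]
    # Conservative fallback: the upper bound across known preparations.
    return max(gi_by_preparation.values())


# Placeholder values for illustration only.
example_gi = {"boiled": 60.0, "boiled then cooled": 50.0, "baked": 85.0}
```

With these placeholder values, a confidently detected "boiled" preparation resolves to 60, while an uncertain or unknown preparation falls back to 85. The fallback can only raise the resulting glycemic load, never lower it.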
The UI surfaces GL rather than GI because GL is the clinically actionable number. GI measures the speed at which a food raises blood sugar in a fasting state, referenced to a fixed 50 g carbohydrate portion. GL measures the actual blood-sugar impact of the food as consumed, at the portion size actually eaten. A watermelon has a high GI (72) but a low GL per standard serving (approximately 5) because a standard serving contains very few carbohydrates. For mixed meals — which is most meals — the per-dish GL values are summed to produce a meal-level GL figure.
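The per-dish calculation and the meal-level sum can be sketched as follows. The function names are hypothetical, and the carbohydrate amounts in the example are illustrative assumptions (the 7 g for watermelon was picked to be consistent with a GL of about 5 at a GI of 72).

```python
from typing import Iterable, Tuple


def glycemic_load(carbs_g: float, gi: float) -> float:
    """GL = (carbohydrates in grams x GI) / 100."""
    return carbs_g * gi / 100.0


def meal_glycemic_load(dishes: Iterable[Tuple[float, float]]) -> float:
    """Sum per-dish GL over (carbs_g, gi) pairs to get the meal-level figure."""
    return sum(glycemic_load(carbs_g, gi) for carbs_g, gi in dishes)


# Illustrative meal: basmati rice (45 g carbs, GI 58) and watermelon (7 g carbs, GI 72).
meal = [(45.0, 58.0), (7.0, 72.0)]
total_gl = meal_glycemic_load(meal)
```

Here the rice contributes about 26.1 and the watermelon about 5.0, so the meal comes to roughly 31. The high-GI fruit adds far less than the moderate-GI rice because it carries so little carbohydrate.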
Every number, traceable.
The final stage is not a computation — it is an accountability step. Every number CalEye returns is attached to the exact database entry that produced it: the USDA SR-Legacy food code, the ADA Exchange List category, the Sydney University GI record. Tapping any number in the CalEye UI surfaces a modal with the food name as it appears in the source database, the per-100 g macro profile, the GI value and its source citation, and the confidence score for each stage of the pipeline.
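One way to picture the record behind each number is a small data structure like the one below. This is a sketch of the fields the modal is described as showing; the type names, field names, and the `EXAMPLE-...` identifiers are hypothetical and do not correspond to real database codes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceRef:
    database: str   # e.g. "USDA SR-Legacy"
    record_id: str  # placeholder IDs only in this sketch
    food_name: str  # name as it appears in the source database


@dataclass(frozen=True)
class CitedValue:
    label: str
    value: float
    unit: str
    macro_source: SourceRef
    gi_value: float
    gi_source: SourceRef
    per_100g: Dict[str, float]
    stage_confidence: Dict[str, float]


def modal_lines(cited: CitedValue) -> List[str]:
    """Render the provenance details for one number as plain text lines."""
    src, gi = cited.macro_source, cited.gi_source
    lines = [
        f"{cited.label}: {cited.value:g} {cited.unit}",
        f"Source food: {src.food_name} ({src.database}, {src.record_id})",
        "Per 100 g: " + ", ".join(f"{k} {v:g} g" for k, v in cited.per_100g.items()),
        f"GI {cited.gi_value:g} ({gi.database}, {gi.record_id})",
    ]
    lines += [f"{stage} confidence: {score:.0%}"
              for stage, score in cited.stage_confidence.items()]
    return lines
```

Keeping the source reference on the value itself, rather than in a separate log, is what lets a clinician trace a disputed figure back to a specific stage.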
Other AI nutrition applications return numbers without provenance. We believe this is the disqualifying failure mode for diabetic use. A person managing blood glucose cannot act on a carbohydrate figure whose lineage they cannot inspect. When a number from a black-box model disagrees with a clinician's expectation, there is no path to resolution — no way to know whether the error is in the AI's food identification, its portion estimation, its macro lookup, or its GI table. Citation is the mechanism that makes any disagreement resolvable, and makes the system auditable by the clinician, not just the patient.
What we get right. And what we don't.
The honest account of CalEye's accuracy is not a single headline figure — it is a distribution across meal types, lighting conditions, and photo angles. Here is what the internal test data shows, with the methodology visible so clinicians can evaluate it.
For single-dish photographs taken in reasonable ambient light with a visible plate edge — the conditions where all six pipeline stages can operate close to their design parameters — our internal test set of 1,200 plates shows a median error of ±8% on total carbohydrates. This covers a wide range of food types including Western, South Asian, and East Asian preparations. The test set was not drawn from the training data and was evaluated against manually weighed and lab-measured reference values.
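For readers who want to reproduce this kind of figure on their own data, the sketch below computes a median absolute percentage error against weighed reference values. Reading the "median error" above as this particular statistic is an assumption on our part for the sake of the example, and the sample numbers are invented.

```python
from statistics import median
from typing import Sequence


def median_abs_pct_error(estimates: Sequence[float], references: Sequence[float]) -> float:
    """Median of |estimate - reference| / reference, expressed in percent."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    if any(r <= 0 for r in references):
        raise ValueError("reference values must be positive")
    errors = [abs(e - r) / r * 100.0 for e, r in zip(estimates, references)]
    return median(errors)


# Invented carbohydrate figures in grams: app estimates vs weighed references.
sample_estimates = [46.0, 30.0, 55.0]
sample_references = [50.0, 32.0, 50.0]
sample_error = median_abs_pct_error(sample_estimates, sample_references)
```

The three per-plate errors here are 8%, 6.25%, and 10%, so the median is 8%. A median hides the tail, which is why the figures for mixed plates are reported separately.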
For mixed restaurant plates — multiple dishes, sauces overlapping, low or mixed light, no clear plate edge, portions plated decoratively rather than practically — error climbs to ±15–20%. We publish this figure because we think under-reporting it would be the category of dishonesty that makes AI medical tools dangerous.
Where we lose accuracy most consistently: deep-fried foods, where oil absorption after cooking is invisible to the camera and can represent 20-40% of the caloric content of the item; stews and braises, where hidden ingredients — ghee, coconut milk, sugar added during cooking — are undetectable from visual inspection; servings above approximately 500 g, where the portion model's scale estimation becomes less reliable at large volumes; and composite dishes like biryani, paella, or lasagne, where multiple preparation steps and hidden fats have been physically integrated into a dish that reads as a single region.
Where we are confident: simple plates with clear ingredient boundaries, single-ingredient preparations (a grilled fish fillet, a bowl of oats, a piece of fruit), and packaged snacks where the barcode can be used as a cross-check against the visual estimate. For these categories, the ±8% figure holds across lighting conditions, and in our clinical partner evaluation, no endocrinologist reviewing the output flagged a number as clinically unacceptable for decision-support use.
A final and important caveat: CalEye is not a medical device and is not regulated as one. The numbers it returns are decision-support inputs — the same category as a food diary or a nutrition label — and are not a replacement for the clinician-prescribed insulin protocol, the ADA Exchange List exercise your dietitian has designed for you, or the regular A1C monitoring your endocrinologist has recommended. Our aim is to make the inputs to those clinical decisions more accurate than the mental estimates most people currently rely on. That is a meaningful improvement, and it is a more honest claim than those made by apps that position AI nutrition as a clinical replacement.
Common questions.
- How is this different from MyFitnessPal's photo feature?
- MyFitnessPal's photo-to-food feature uses the photograph as a search shortcut: the image is classified into a category, and the result is a lookup against MFP's crowdsourced food database. The photograph is the interface, not the measurement. CalEye uses the photograph as the primary nutritional input: portion size is derived from the visual geometry of the image itself, not from a default serving size associated with a database entry. The practical difference is largest for home-cooked and restaurant meals, where no database entry reflects what was actually served. MFP's approach works well for packaged foods with reliable label data. Ours works for the other 70% of meals.
- Does it work offline?
- Yes. The core models — segmentation, identification, portion, and macro computation — run on-device. No network connection is required to photograph a meal and receive carbohydrate, calorie, and glycemic load figures. Citation links, which connect a number to its source entry in USDA SR-Legacy or the Sydney GI Database, require a network connection for the first view; once retrieved, the source record is cached locally and available offline for subsequent views. The on-device model is updated via a background refresh when a connection is available, consistent with the update schedule published in the app's settings screen.
- Can I trust the carb count enough to dose insulin?
- Not as a replacement for your clinician's insulin protocol — but yes as a more accurate input than the mental estimate most people currently use for dose calculation. The ADA's position is that carb counting improves glycemic control when the carb estimates are reasonably accurate. CalEye's ±8% accuracy on simple plates is substantially better than the estimated ±30–50% typical of unassisted mental estimation. For mixed restaurant plates, the ±15–20% figure is closer to the unassisted baseline, and we label those results accordingly. Always cross-check with your endocrinologist's formula and the correction factors in your personalised protocol. We are a decision-support tool, not a dosing calculator.
- Why do you show a single calorie number? Where is the confidence interval?
- We do give a confidence interval — it is one tap away. The primary UI surface shows the point estimate because most users want a single actionable number at a glance, and adding the interval to every display element would make the interface harder to read at the moment of use. Tapping any calorie, carbohydrate, or GL number surfaces a detail panel showing the full confidence interval, the contributing stage confidence scores, and the source citation. Marketing copy for CalEye shows the point estimate in headlines; the interval is always present in the product and is the number we recommend for clinical use. If you are using CalEye figures in a clinical context, always use the interval, not the headline.
Every science post in the CalEye archive.
44 peer-reviewed posts on the science behind food recognition, calorie measurement, and AI nutrition.
AI vision & food recognition
- AI vs Registered Dietitian — Accuracy Comparison Studies How does AI food recognition compare to dietitian estimates? We review published accuracy studies, controlled benchmarks, and where each approach has the edge.
- Food Image Segmentation — The Medical-Grade Challenge Pixel-level food segmentation borrows techniques from radiology AI. This is why identifying meal boundaries is harder than it looks — and what research shows.
- How AI Sees Food — Convolutional Vision for Nutrition Convolutional neural networks now identify food from a single photo. Here is the signal-processing pipeline that turns pixels into macronutrient estimates.
- The Future of Nutrition AI — What the Next Decade Will Resolve From multimodal LLMs to continuous metabolomics, nutrition AI is at an inflection point. These are the five open problems research is poised to resolve by 2035.
Calorie science & measurement
- Active Calories vs Total Calories: Why Your Watch Shows Two Numbers NEAT, EAT, and BMR unpacked — why the difference between active and total calories matters for setting an accurate deficit.
- Bomb Calorimetry — How Calorie Counts Are Measured in a Lab Food is combusted in a pressurized oxygen chamber to measure gross energy. Here is the bomb calorimetry protocol and why metabolic energy differs from gross.
- Calories in Chicken Breast (Cooked, Raw, Per 100g) How many calories in chicken breast? Cooked skinless is ~165 cal per 100g with 31g protein. Full breakdown by portion, cooked vs raw.
- Do Resting Calories Count Toward Your Deficit? The TDEE Truth Clarifying BMR, NEAT, and active burn — which numbers to include when calculating a real daily deficit. Real research, real numbers, and a clear answer you can…
- How Calories Are Measured: From Bomb Calorimetry to Food Labels The full chain from laboratory combustion to Atwater factors to the number on your nutrition panel — including the systematic ±10% error built into every food.
- How Many Calories Should You Burn Daily? It Depends on This TDEE components, activity multipliers, and why a fixed daily burn target is less useful than a weekly average — a plain-language guide to understanding energy.
- Most Accurate Ways to Measure Calorie Burn: Ranked by Error Rate Doubly-labelled water, metabolic carts, wearables, and MET formulas — error rates and use cases compared for researchers and calorie-conscious individuals.
- The 4-4-9 Rule — Where It Breaks Down The 4-4-9 Atwater shorthand is universal but systematically wrong for fiber, alcohol, sugar alcohols, and novel proteins. Here is where the rule breaks down.
- The Atwater Factors — Why Modern Science Adjusts Them Atwater's 4-4-9 factors date from 1899. Modern digestibility studies and fiber science have driven FDA, FAO, and WHO to revise the energy conversion system.
- Why Per-100g Is the Only Stable Macro Reference Per-serving figures shift with labeling rules. Per-100g is the one stable anchor for comparing foods and tracking macros accurately across all databases.
Glycemic science & glucose
- CGM Accuracy — The Engineering Limits CGMs measure interstitial fluid, not blood, introducing physiological lag and calibration limits. Here is the accuracy science behind consumer glucose sensors.
- Glucose vs Fructose — Different Metabolic Pathways Glucose and fructose share a molecular formula but follow different metabolic routes. Here is the biochemistry, liver load, and why the sugar source matters.
- Glycemic load vs glycemic index — the one that matters Why glycemic load is the more useful metric for diabetic meal planning than glycemic index, and how AI photo analysis surfaces it.
- Portion Size from 2D Photos — The Depth Problem Estimating food volume from a flat image requires depth reconstruction. Here is the geometry, the error ranges, and what current AI models do to compensate.
- Postprandial Glucose Response — Individual Variability Research The same meal can spike glucose 2-3x differently across people. The science of inter-individual glycemic variability and what it means for nutrition apps.
- Resistant Starch — The Carb Your Gut Digests Differently Resistant starch escapes small-intestine digestion and feeds gut bacteria. The science of RS types, fermentation, short-chain fatty acids, and glycemic impact.
- The Microbiome's Role in Carb Metabolism Gut bacteria ferment undigested carbs and produce short-chain fatty acids that alter glycemic response. The science of microbiome-carbohydrate interactions.
- The PREDICT Studies — Personalized Nutrition Data PREDICT 1 and 2 are the largest personalized nutrition studies run. Here is the methodology, findings, and what they reveal about individual dietary response.
- The Sydney GI Database — Methodology Explained The Sydney GI Database is the global gold standard for glycemic index values. Here is how foods are tested, averaged, and why values vary between labs.
Macronutrient science
- Body Recomp Protein Targets for Women Over 40: The Higher Bar Why older women need more protein per kg than standard guidelines, and how to hit it without excessive calories. Peer-reviewed evidence and worked examples…
- Dietary Fat Doesn't Make You Fat — Here's What the Studies Say A myth-busting walkthrough of fat metabolism, adipogenesis, and why calorie surplus, not fat grams, drives fat storage. The science, the numbers, and what…
- Does Collagen Count as Protein? The Amino Acid Gap Explained Collagen's missing tryptophan and low leucine make it poor for MPS — here's how to correctly log it without over-counting.
- Does Extra Protein Convert to Carbs or Fat? The Metabolic Truth Gluconeogenesis explained: when the body converts protein to glucose, and why it rarely contributes to fat gain. Real research, real numbers, and a clear answer…
- Does Tracking Macros Actually Work Long-Term? The Studies Say... A review of adherence, body-composition, and psychological outcomes in macro-tracking research over 6–24 months. A practical guide with the studies, sources…
- Gaining Muscle While Losing Fat: The 3 Conditions That Must Be Met Caloric partitioning, training stimulus, and protein timing — the precise conditions under which recomp actually happens.
- The Thermic Effect of Food — Protein's Metabolic Advantage Digesting protein burns 20–30% of its energy as heat. Here is the biochemistry of diet-induced thermogenesis and why protein has the highest thermic effect.
- Why Fiber Doesn't Count as Carbs — The Net-Carb Derivation Net carbs subtract fiber from total carbohydrates, but the biochemistry is nuanced. Here is the science of digestibility, fermentation, and the regulatory math.
Food composition & databases
- Carnivore Diet Protein Intake: How Much Is Too Much? Gluconeogenesis risk, kidney load, and satiety data — the protein range that works without crowding out fat on carnivore.
- USDA SR-Legacy — What's in the Database Your App Uses USDA SR-Legacy release 28 underpins almost every nutrition app. Here is what it contains, how values are measured, and where the gaps are for global cuisines.
- Why Cooking Method Changes Nutritional Values Boiling, roasting, steaming, and frying each alter macronutrients and GI. Here is the nutritional science of how cooking method transforms food composition.
- Zone Diet Evidence Review: What the Data Actually Shows A critical look at peer-reviewed studies on the 40-30-30 Zone Diet — benefits, pitfalls, and who it actually works for. Peer-reviewed evidence and worked examples…
Other science
- Calories in a Banana: By Size, With Macros How many calories in a banana? A medium banana has about 105 calories. Full breakdown by size, plus carbs, fiber, and how it fits your day.
- Calories in an Egg: Whole, White, and Yolk How many calories in an egg? A large egg has about 72 calories and 6g protein. Full breakdown by size, plus egg white vs yolk and cooking.
- Calories in Rice: White, Brown, and Cooked Portions How many calories in rice? Cooked white rice is ~130 cal per 100g (~205 per cup). Full breakdown of white vs brown and real portions.
- Calories in Roti (Chapati): By Size, With Macros How many calories in a roti? A medium plain chapati has about 110 calories. Breakdown by size, with and without ghee, plus carbs and protein.
- Does the Calorie-Counting Diet Work? What the Evidence Says The calorie-counting diet is the most direct way to lose fat — but it isn't for everyone. Here's what the evidence shows and who should skip it.
- How Many Calories Do Competitive Bodybuilders Actually Eat? Bulk-to-cut cycles, peak-week depletion, and the per-kg calorie ranges used at different competitive levels — a science-first breakdown.
- How Many Calories Per Day Do You Actually Need? How many calories per day you need depends on sex, age, weight, and activity. Here are the real ranges and how to calculate your own number.
- How to Find the Calories in Any Food (3 Reliable Ways) Need the calories in a food? Here are three reliable methods — nutrition labels, the USDA database, and photo estimation — and when each is best.
- What Are Maintenance Calories and How Do You Find Yours? Maintenance calories are the energy you burn in a day. Here's how to find yours with the Mifflin-St Jeor equation and real activity multipliers.